CN110347921B

CN110347921B - Label extraction method and device for multi-mode data information

Info

Publication number: CN110347921B
Application number: CN201910597699.7A
Authority: CN
Inventors: 林会杰; 石子晶; 刘硙; 叶松鹤
Original assignee: Belight Innovation Beijing Information Technology Co ltd
Current assignee: Belight Innovation Beijing Information Technology Co ltd
Priority date: 2019-07-04
Filing date: 2019-07-04
Publication date: 2022-04-19
Anticipated expiration: 2039-07-04
Also published as: CN110347921A

Abstract

The invention discloses a method and a device for extracting labels from multi-modal data information, relating to the technical field of data processing. Extracting label information from published content information that contains multi-modal data helps ensure the accuracy of subsequently executed automatic demand matching. The main technical scheme of the invention is as follows: receiving content information published by a user, wherein the content information comprises multi-modal data information; converting the content information published by the user into text information; and processing the text information by using a preset label extraction model to obtain label information corresponding to the text information. The method and the device are mainly used for extracting label information from published content information containing multi-modal data information.

Description

Label extraction method and device for multi-mode data information

Technical Field

The invention relates to the technical field of data processing, in particular to a method and a device for extracting a label of multi-mode data information.

Background

With the continuous development of science and technology, demand matching platforms have emerged. On such a platform, a user can actively publish demand information and wait for feedback from users with the same demand, and can also search the published information, forwarded question-and-answer information, and the like uploaded by others as a reference for solving the user's own demand. Demand matching between the party publishing information and the party seeking information can thus be achieved by means of the network platform.

At present, in addition to obtaining matching results through active publication and manual searching by users as described above, the demand matching platform also automatically performs demand matching on the large amount of information published to it, so that the platform can recommend matching results to users. For example, tag information is extracted from the published text information, and the tag information summarizes the demand content published by the user. By comparing a large amount of different tag information, the demand content summarized by different tags can be matched, thereby automatically matching the demand content published by different users.

However, the published information received on the demand matching platform may include not only text information but also picture information, video information, and the like. If tag information is extracted from the text information only, the published demand content is not summarized comprehensively, and as a result the automatic demand matching subsequently performed by the platform is not accurate enough.

Disclosure of Invention

In view of this, the present invention provides a method and an apparatus for extracting labels from multimodal data information, and a main object of the present invention is to extract label information corresponding to release content information from multimodal data information during a process of performing demand matching on received release information by using a network platform, so that the obtained label information can fully summarize the release demand content, which is helpful for ensuring accuracy of subsequently executed automatic demand matching operation.

In order to achieve the above purpose, the present invention mainly provides the following technical solutions:

in a first aspect, the present invention provides a method for extracting tags from multimodal data information, the method comprising:

receiving content information issued by a user, wherein the content information comprises multi-mode data information;

converting the content information issued by the user into text information;

and processing the text information by using a preset label extraction model to obtain label information corresponding to the text information.

Optionally, the processing the text information by using a preset tag extraction model to obtain tag information corresponding to the text information includes:

processing the text information by using a preset topic model, and judging the information category to which the text information belongs;

searching a preset template matched with the text information according to the information category to which the text information belongs, wherein the preset template comprises a plurality of information dimensions, is created in advance and corresponds to the specified information category;

analyzing the text information according to the information dimensions to obtain content information under each information dimension;

and forming the content information under each information dimension into label information corresponding to the text information.

Optionally, the analyzing the text information according to the information dimensions to obtain content information under each information dimension includes:

performing word segmentation processing on the text information in combination with syntactic analysis;

extracting keyword information from the text information after word segmentation;

and matching the keyword information with the information dimensions to obtain the keyword information under each information dimension.

Optionally, the converting the content information published by the user into text information includes:

and processing the content information issued by the user by using an image description algorithm, and outputting the converted text information.

In a second aspect, the present invention further provides a tag extraction apparatus for multimodal data information, the apparatus comprising:

the receiving unit is used for receiving content information issued by a user, and the content information comprises multi-mode data information;

the conversion unit is used for converting the content information which is received by the receiving unit and is issued by the user into text information;

and the processing unit is used for processing the text information obtained by the conversion unit by using a preset label extraction model to obtain label information corresponding to the text information.

Optionally, the processing unit includes:

the judging module is used for processing the text information by utilizing a preset theme model and judging the information category to which the text information belongs;

the searching module is used for searching a preset template matched with the text information according to the information category to which the text information belongs, wherein the preset template comprises a plurality of information dimensions, is created in advance and corresponds to the specified information category;

the analysis module is used for analyzing the text information according to the information dimensions to obtain content information under each information dimension;

and the composition module is used for composing the content information under each information dimension into label information corresponding to the text information.

Optionally, the parsing module includes:

the word segmentation sub-module is used for performing word segmentation processing on the text information in combination with syntactic analysis;

the extraction submodule is used for extracting keyword information from the text information after word segmentation;

and the matching submodule is used for matching the keyword information with the information dimensions to obtain the keyword information under each information dimension.

Optionally, the conversion unit includes:

and the processing module is used for processing the content information issued by the user by using an image description algorithm and outputting the converted text information.

In a third aspect, the present invention further provides a storage medium, where the storage medium includes a stored program, and when the program runs, the storage medium controls a device where the storage medium is located to execute the above-mentioned label extraction method for multimodal data information.

In a fourth aspect, the present invention further provides a processor, where the processor is configured to execute a program, where the program executes the above method for extracting a tag from multi-modal data information.

By the technical scheme, the technical scheme provided by the invention at least has the following advantages:

the invention provides a method and a device for extracting labels of multi-modal data information. Compared with the prior art, the method overcomes the defect that the prior method only extracts the text label information, and the label information is extracted from the multi-modal data information, so that the obtained label information can fully summarize the published requirement content, and the accuracy of the subsequent automatic requirement matching operation is ensured.

The foregoing description is only an overview of the technical solutions of the present invention, and the embodiments of the present invention are described below in order to make the technical means of the present invention more clearly understood and to make the above and other objects, features, and advantages of the present invention more clearly understandable.

Drawings

Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the invention. Also, like reference numerals are used to refer to like parts throughout the drawings. In the drawings:

FIG. 1 is a flowchart of a method for extracting labels from multi-modal data information according to an embodiment of the present invention;

FIG. 2 is a flow chart of another method for extracting tags from multi-modal data information according to an embodiment of the present invention;

FIG. 3 is a block diagram of a tag extraction apparatus for multi-modal data information according to an embodiment of the present invention;

FIG. 4 is a block diagram of another tag extraction apparatus for multimodal data information according to an embodiment of the present invention.

Detailed Description

Exemplary embodiments of the present invention will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the invention are shown in the drawings, it should be understood that the invention can be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the invention to those skilled in the art.

An embodiment of the invention provides a label extraction method for multi-modal data information. As shown in FIG. 1, on an established demand matching platform, the method extracts label information from published content information composed of multi-modal data information, so that accuracy is ensured when subsequent demand matching operations are executed using the extracted label information. To this end, the embodiment of the invention provides the following specific steps:

101. Receive content information published by a user, the content information comprising multi-modal data information.

It should be noted that the application scenario of the embodiment of the present invention is as follows: a demand matching platform is established, on which the published content information includes, but is not limited to, questions and answers, commentary on current events, project exchanges, popular science articles, and the like.

The published content information contains multi-modal data information, which means that it comes in various types, such as text information, picture information, video information, and the like. Thus, in the embodiment of the present invention, the published content information contains at least two of the above-described types of data information.

In the embodiment of the invention, a user can publish a large amount of information to the requirement matching platform, and after the platform receives the large amount of information, the platform can automatically perform requirement matching processing on the large amount of information, so that the platform recommends result information meeting requirement matching to the user.

102. Convert the content information published by the user into text information.

In the embodiment of the present invention, the requirement matching platform performs a multi-mode information acquisition operation on the received release content information, and the obtained release content information may be: text information, text plus picture information, text plus video information, and so forth.

When the published content information contains multi-modal data information, it is processed by converting the non-text information into text information. For example, for published content consisting of text plus pictures, content information is obtained from the pictures and merged with the original text information to obtain the text information corresponding to the published content. Further, when the published content information includes video information, the same method can be used to obtain content information from the video, since a video is composed of a plurality of consecutive single-frame images.
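The conversion described above can be sketched as follows. This is a minimal illustration rather than the patented implementation: the `Post` container, its field names, and the `fake_describe` stand-in for an image description model are all assumptions introduced here.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Post:
    """Content published by a user; every field name here is hypothetical."""
    text: str = ""
    images: List[str] = field(default_factory=list)        # picture identifiers
    video_frames: List[str] = field(default_factory=list)  # sampled single frames


def to_text(post: Post, describe_image: Callable[[str], str]) -> str:
    """Convert non-text parts to text and merge them with the original text."""
    parts = [post.text] if post.text else []
    # A video is treated as a sequence of single-frame images, so pictures
    # and frames both go through the same description function.
    for item in post.images + post.video_frames:
        parts.append(describe_image(item))
    return " ".join(parts)


def fake_describe(image_id: str) -> str:
    # Stand-in for a real image description model (assumption).
    return f"[description of {image_id}]"


post = Post(text="Looking for a venue.", images=["hall.jpg"], video_frames=["f0", "f1"])
merged = to_text(post, fake_describe)
print(merged)
```

Passing the description function in as a parameter keeps the merge logic independent of whichever image description algorithm is actually used.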

103. Process the text information by using a preset label extraction model to obtain label information corresponding to the text information.

In the embodiment of the invention, a label extraction model can be trained in advance by using a large amount of labeled text information, and after the release content information containing multi-mode data information is converted into the text information, the obtained text information can be processed by using the label extraction model, and the label information extracted from the text information, namely the label information corresponding to the release content information, is output.

Further, in the embodiment of the present invention, the tag information is structured data information composed of multi-dimensional information. For example, one piece of tag information corresponding to published content information includes: event topic, publication time, demand object characteristics, demand event attributes, and the like. The tag information corresponding to one piece of published content information is sufficient to comprehensively summarize the published demand content, so that accuracy is ensured when subsequent demand matching operations are performed by using the extracted tag information.
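Steps 101 to 103 can be read as a three-stage pipeline. The sketch below only shows how the stages chain together. The dictionary layout of the content and both demo functions are hypothetical placeholders, and the demo label model is not a trained model.

```python
from typing import Any, Callable, Dict

Content = Dict[str, Any]   # hypothetical container for published content
Labels = Dict[str, str]


def extract_labels(content: Content,
                   convert: Callable[[Content], str],
                   label_model: Callable[[str], Labels]) -> Labels:
    """Step 101 supplies `content`; step 102 converts it; step 103 labels it."""
    text = convert(content)      # step 102: multi-modal content -> text
    return label_model(text)     # step 103: text -> label information


def demo_convert(content: Content) -> str:
    # Assumes image captions were already produced elsewhere.
    captions = content.get("captions", [])
    return " ".join([content.get("text", "")] + list(captions)).strip()


def demo_label_model(text: str) -> Labels:
    # Placeholder for a model trained on labeled text (assumption):
    # it simply takes the first sentence as the event topic.
    return {"event_topic": text.split(".")[0]}


content = {"text": "Seeking a trainer. Contact me.", "captions": ["a classroom"]}
labels = extract_labels(content, demo_convert, demo_label_model)
print(labels)
```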

The embodiment of the invention provides a method for extracting labels of multi-modal data information. Compared with the prior art, the method and the device for extracting the label information make up the defect that the existing method only extracts the text label information, and the embodiment of the invention extracts the label information from the multi-modal data information, so that the obtained label information can fully summarize the published demand content, and the accuracy of the automatic demand matching operation of subsequent execution is ensured.

In order to describe the above embodiment in more detail, an embodiment of the present invention further provides another method for extracting tags from multi-modal data information. As shown in FIG. 2, this method gives a specific implementation of extracting tag information from the text information corresponding to published content information by using a preset tag extraction model. To this end, the embodiment of the present invention provides the following specific steps:

201. Receive content information published by a user, the content information comprising multi-modal data information.

In the embodiment of the present invention, for a description of this step, refer to step 101; it is not repeated here.

202. Convert the content information published by the user into text information.

In the embodiment of the present invention, for a description of this step, refer to step 102; it is not repeated here.

Further, the following steps 203 to 206 elaborate on step 103:

203. Process the text information by using a preset topic model, and determine the information category to which the text information belongs.

The text information processed by the preset topic model is the data information obtained after the published content information containing multi-modal data information has been converted.

In the embodiment of the present invention, a topic model may be trained in advance by using a large amount of labeled text information. The topic model is used to determine the information category to which the text information belongs. For example, suppose the text information is: "A friend's company wants to find several online training companies to train its employees online in real estate knowledge or some general courses. Do any alumni know which companies are good?" After the topic model processes this text information, it outputs that the content information published by the user belongs to the "finding people" category.

In the embodiment of the invention, the text information is first processed by the trained topic model to obtain its information category, which narrows the scope of subsequent processing. Because text information of different information categories has different focus contents, determining the information category makes it easier to analyze the text information with the appropriate focus (for example, on the focus contents of the "finding people" category).
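The source does not specify which algorithm the topic model uses, only that it is trained on labeled text. As a rough stand-in, the sketch below scores each category by counting hand-picked keywords. The keyword lists and the second category name are assumptions made for illustration.

```python
from typing import Dict, List

# Hypothetical keyword lists standing in for a trained topic model.
CATEGORY_KEYWORDS: Dict[str, List[str]] = {
    "finding people": ["find", "looking for", "recommend", "company"],
    "question and answer": ["how", "why", "what"],
}


def classify(text: str) -> str:
    """Return the category whose keywords occur most often in the text."""
    lowered = text.lower()
    scores = {
        category: sum(lowered.count(keyword) for keyword in keywords)
        for category, keywords in CATEGORY_KEYWORDS.items()
    }
    # On a tie, the category listed first in CATEGORY_KEYWORDS wins.
    return max(scores, key=lambda category: scores[category])


text = "A friend's company wants to find several online training companies."
category = classify(text)
print(category)
```

A real system would replace `classify` with a trained classifier. The point here is only the interface: text goes in, one information category comes out.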

204. Search for a preset template matching the text information according to the information category to which the text information belongs, where the preset template includes a plurality of information dimensions, is created in advance, and corresponds to a specified information category.

In the embodiment of the present invention, because text information of different information categories has different focus contents, different templates are edited in advance for different information categories, and each template contains the information dimensions corresponding to its information category. Continuing the example above, the text information has been determined to belong to the "finding people" category, and the information dimensions required for this category are: the searched object, the object attribute, the service provided by the object, and the like. The object attribute dimension can be further refined into several information dimensions, such as the object's "enterprise registration time", "company category", and "main business". In the embodiment of the invention, a preset template amounts to a customized set of information dimensions for one information category, so the information dimensions contained in a preset template can be written in advance as required.

For the embodiment of the invention, after the information category to which the text information belongs is judged, the corresponding preset template is searched according to the determined information category, and a plurality of information dimensions contained in the preset template are obtained.
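A preset template can be pictured as a lookup table from information category to a list of information dimensions. In the sketch below, the dimension names come from the "finding people" example above, while the `TEMPLATES` table and the `find_template` helper are hypothetical names.

```python
from typing import Dict, List

# One pre-written template per information category.
TEMPLATES: Dict[str, List[str]] = {
    "finding people": [
        "searched object",
        "enterprise registration time",    # refinement of "object attribute"
        "company category",                # refinement of "object attribute"
        "main business",                   # refinement of "object attribute"
        "service provided by the object",
    ],
}


def find_template(category: str) -> List[str]:
    """Return the information dimensions for a category, or fail loudly."""
    template = TEMPLATES.get(category)
    if template is None:
        raise ValueError(f"no preset template for category {category!r}")
    return template


dimensions = find_template("finding people")
print(dimensions)
```

Raising an error for an unknown category is a design choice of this sketch. The source does not say what happens when no template matches.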

205. Analyze the text information according to the information dimensions to obtain the content information under each information dimension.

In the embodiment of the invention, the text information is analyzed according to the information dimensions contained in the preset template, that is, the content information contained in the text is matched with a plurality of information dimensions to obtain the content information in each information dimension.

Specifically, the step flow may include:

First, word segmentation is performed on the text information in combination with syntactic analysis, so that the text can be analyzed according to its grammatical components (such as subject, predicate, object, and complement). During segmentation, useless words irrelevant to the semantics are deleted and words with actual meaning are retained.

Second, keyword information is extracted from the segmented text information and matched with the information dimensions to obtain the keyword information under each information dimension. This is equivalent to screening a large passage of text: only the keyword information sufficient to represent the semantics is retained and matched to the specified information dimensions, which simplifies the content information under each information dimension.
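The two sub-steps above can be illustrated with a simplified sketch. It tokenizes English text with a regular expression and omits the syntactic analysis the source calls for. The stop-word list and the per-dimension vocabularies are assumptions, not part of the source.

```python
import re
from typing import Dict, List, Set

# Hypothetical stop words: tokens treated as irrelevant to the semantics.
STOP_WORDS: Set[str] = {"a", "an", "the", "to", "for", "some", "wants", "and", "in", "or"}

# Hypothetical vocabulary used to decide which dimension a keyword belongs to.
DIMENSION_VOCAB: Dict[str, Set[str]] = {
    "searched object": {"company", "companies", "trainer"},
    "service provided by the object": {"training", "courses"},
}


def segment(text: str) -> List[str]:
    """Split text into lowercase word tokens and drop stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [token for token in tokens if token not in STOP_WORDS]


def match_dimensions(keywords: List[str],
                     vocab: Dict[str, Set[str]]) -> Dict[str, List[str]]:
    """Assign each keyword to every dimension whose vocabulary contains it."""
    result: Dict[str, List[str]] = {dimension: [] for dimension in vocab}
    for word in keywords:
        for dimension, terms in vocab.items():
            if word in terms and word not in result[dimension]:
                result[dimension].append(word)
    return result


text = "Wants to find online training companies for some general courses"
keywords = segment(text)
matched = match_dimensions(keywords, DIMENSION_VOCAB)
print(matched)
```

Keywords that match no dimension, such as "online" here, are simply dropped. This mirrors the idea of keeping only the content needed under each dimension.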

206. Compose the content information under each information dimension into the label information corresponding to the text information.

In the embodiment of the present invention, the tag information is structured data information composed of multi-dimensional information. For example, one piece of tag information corresponding to published content information includes: event topic, publication time, demand object characteristics, demand event attributes, and the like. The content information composed of the keyword information under each information dimension, obtained through steps 201 to 205 above, is sufficient to comprehensively summarize the published demand content, so that accuracy is ensured when subsequent demand matching operations are performed by using the extracted tag information.
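One way to picture the structured tag information is as a small record that can be serialized for later matching. In this sketch the field names `event_topic` and `publication_time`, the use of the category as the event topic, and the JSON serialization are all illustrative assumptions.

```python
import json
from typing import Dict, List


def compose_label(category: str,
                  publication_time: str,
                  dimension_content: Dict[str, List[str]]) -> Dict[str, object]:
    """Combine per-dimension keywords into one structured label record."""
    label: Dict[str, object] = {
        "event_topic": category,            # assumption: topic taken from category
        "publication_time": publication_time,
    }
    for dimension, words in dimension_content.items():
        if words:                           # skip dimensions with no content
            label[dimension] = words
    return label


label = compose_label(
    "finding people",
    "2019-07-04",
    {"searched object": ["companies"], "company category": []},
)
print(json.dumps(label, ensure_ascii=False))
```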

Further, as an implementation of the methods shown in fig. 1 and fig. 2, an embodiment of the present invention provides a tag extraction apparatus for multimodal data information. The embodiment of the apparatus corresponds to the embodiment of the method, and for convenience of reading, details in the embodiment of the apparatus are not repeated one by one, but it should be clear that the apparatus in the embodiment can correspondingly implement all the contents in the embodiment of the method. The device is applied to extracting tag information from release content information containing multi-modal data information, and particularly as shown in figure 3, the device comprises:

a receiving unit 31, configured to receive content information issued by a user, where the content information includes multi-modal data information;

a conversion unit 32, configured to convert the content information issued by the user and received by the receiving unit 31 into text information;

the processing unit 33 is configured to process the text information obtained by the converting unit 32 by using a preset label extraction model, so as to obtain label information corresponding to the text information.

Further, as shown in fig. 4, the processing unit 33 includes:

the judging module 331 is configured to judge an information category to which the text information belongs by processing the text information using a preset topic model;

a searching module 332, configured to search, according to the information category to which the text information belongs, a preset template matched with the text information, where the preset template includes multiple information dimensions, is created in advance, and corresponds to a specified information category;

the parsing module 333 is configured to parse the text information according to the information dimensions to obtain content information in each information dimension;

a composing module 334, configured to compose the content information in each information dimension into tag information corresponding to the text information.

Further, as shown in fig. 4, the parsing module 333 includes:

the word segmentation sub-module 3331 is used for performing word segmentation processing on the text information in combination with syntactic analysis;

the extraction sub-module 3332 is configured to extract keyword information from the text information after the word segmentation processing;

and the matching sub-module 3333 is configured to match the keyword information with the information dimensions to obtain the keyword information in each information dimension.

Further, as shown in fig. 4, the conversion unit 32 includes:

the processing module 321 is configured to process the content information issued by the user by using an image description algorithm, and output the converted text information.
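The unit structure of FIG. 3 maps naturally onto an object that holds three interchangeable components. The following is only a sketch of that composition: the class name and the lambda stand-ins are hypothetical, and the processing unit here merely counts words instead of extracting real labels.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass
class LabelExtractionDevice:
    receiving_unit: Callable[[Any], Dict[str, Any]]     # corresponds to unit 31
    conversion_unit: Callable[[Dict[str, Any]], str]    # corresponds to unit 32
    processing_unit: Callable[[str], Dict[str, Any]]    # corresponds to unit 33

    def run(self, raw: Any) -> Dict[str, Any]:
        content = self.receiving_unit(raw)       # receive published content
        text = self.conversion_unit(content)     # convert it to text
        return self.processing_unit(text)        # extract label information


device = LabelExtractionDevice(
    receiving_unit=lambda raw: dict(raw),
    conversion_unit=lambda c: " ".join([c.get("text", "")] + c.get("captions", [])),
    processing_unit=lambda t: {"word_count": len(t.split())},  # placeholder only
)
result = device.run({"text": "find a trainer", "captions": ["a classroom"]})
print(result)
```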

The embodiment of the invention also provides a storage medium, which comprises a stored program, wherein when the program runs, the device where the storage medium is located is controlled to execute the tag extraction method of the multi-modal data information.

The embodiment of the invention also provides a processor, which is used for running the program, wherein the label extraction method of the multi-modal data information is executed when the program runs.

The embodiment of the invention provides a method and a device for extracting labels of multi-modal data information. Compared with the prior art, the method and the device for extracting the label information make up the defect that the existing method only extracts the text label information, and the embodiment of the invention extracts the label information from the multi-modal data information, so that the obtained label information can fully summarize the published demand content, and the accuracy of the automatic demand matching operation of subsequent execution is ensured.

As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.

The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.

The memory may include volatile memory in a computer-readable medium, such as random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). The memory is an example of a computer-readable medium.

Computer-readable media, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information that can be accessed by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transitory media) such as modulated data signals and carrier waves.

It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.

The above are merely examples of the present application and are not intended to limit the present application. Various modifications and changes may occur to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the scope of the claims of the present application.

Claims

1. A method for extracting labels of multi-modal data information, the method comprising:

receiving content information issued by a user on a demand matching platform, wherein the content information comprises multi-mode data information, and the multi-mode data information at least comprises text information, text and picture information, and text and video information;

converting the content information issued by the user into text information, specifically comprising: converting the non-text information into text information, and combining the converted text information with the original text information to obtain the text information corresponding to the issued content information;

processing the text information by using a preset label extraction model to obtain label information corresponding to the text information, specifically comprising: processing the text information by using a preset topic model, and judging the information category to which the text information belongs; searching for a preset template matched with the text information according to the information category to which the text information belongs, wherein the preset template comprises a plurality of information dimensions, is created in advance and corresponds to the specified information category; analyzing the text information according to the information dimensions to obtain content information under each information dimension; forming the content information under each information dimension into label information corresponding to the text information;

analyzing the text information according to the information dimensions to obtain content information under each information dimension, wherein the analyzing comprises: performing word segmentation processing on the text information in combination with syntactic analysis; extracting keyword information from the text information after word segmentation; and matching the keyword information with the information dimensions to obtain the keyword information under each information dimension.
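
The steps recited in claim 1 can be illustrated with a minimal sketch. It is not the claimed implementation: the template contents, category names, and the function names `merge_to_text`, `tokenize`, `classify`, and `extract_tags` are hypothetical; a regular-expression tokenizer stands in for word segmentation combined with syntactic analysis, and a keyword-count rule stands in for the preset topic model.

```python
import re

# Hypothetical preset templates: one per information category, each mapping
# an information dimension to the keywords that belong under it.
TEMPLATES = {
    "housing": {
        "location": {"beijing", "shanghai", "downtown"},
        "type": {"apartment", "studio", "house"},
        "action": {"rent", "buy", "sell"},
    },
    "jobs": {
        "role": {"engineer", "designer", "teacher"},
        "location": {"beijing", "shanghai", "remote"},
    },
}


def merge_to_text(original_text, converted_texts):
    """Combine the original text with text converted from non-text parts."""
    parts = [original_text, *converted_texts]
    return " ".join(p.strip() for p in parts if p and p.strip())


def tokenize(text):
    """Stand-in for word segmentation: lowercase alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def classify(tokens):
    """Stand-in for the topic model: category with the most keyword hits."""
    def hits(category):
        vocab = set().union(*TEMPLATES[category].values())
        return sum(1 for tok in tokens if tok in vocab)
    return max(TEMPLATES, key=hits)


def extract_tags(text):
    """Return (category, {dimension: [keywords]}) for the given text."""
    tokens = tokenize(text)
    category = classify(tokens)
    template = TEMPLATES[category]  # look up the template for the category
    tags = {dim: [] for dim in template}
    for tok in tokens:
        for dim, vocab in template.items():
            # Match each keyword to its dimension, skipping duplicates.
            if tok in vocab and tok not in tags[dim]:
                tags[dim].append(tok)
    return category, tags


if __name__ == "__main__":
    # Text recovered from a picture or video would come from a separate
    # conversion step (for example OCR or speech recognition), assumed here.
    text = merge_to_text("Looking to rent an apartment", ["Beijing downtown"])
    print(extract_tags(text))
```

In this sketch the label information is the per-dimension keyword dictionary; a real system would replace each stand-in with the corresponding preset model.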

2. The method according to claim 1, wherein the converting the content information published by the user into text information comprises:

3. A tag extraction apparatus for multimodal data information, the apparatus comprising:

the receiving unit is used for receiving content information issued by a user on the demand matching platform, and the content information comprises multi-mode data information;

the conversion unit is used for converting the content information which is received by the receiving unit and issued by the user into text information, wherein the multi-mode data information at least comprises text information, text and picture information, and text and video information, and the converting specifically includes: converting the non-text information into text information, and combining the converted text information with the original text information to obtain the text information corresponding to the issued content information;

the processing unit is configured to process the text information obtained by the conversion unit by using a preset label extraction model to obtain label information corresponding to the text information;

the processing unit includes:

the composition module is used for composing the content information under each information dimension into label information corresponding to the text information;

wherein the parsing module comprises: the word segmentation sub-module is used for performing word segmentation processing on the text information in combination with syntactic analysis; the extraction submodule is used for extracting keyword information from the text information after word segmentation; and the matching submodule is used for matching the keyword information with the information dimensions to obtain the keyword information under each information dimension.

4. The apparatus of claim 3, wherein the conversion unit comprises:

5. A storage medium characterized by comprising a stored program, wherein a device on which the storage medium is located is controlled to execute the tag extraction method of the multimodal data information according to claim 1 or 2 when the program runs.

6. A processor, characterized in that the processor is configured to execute a program, wherein the program executes the method for extracting tags from multimodal data information according to claim 1 or 2.