CN113254824B

CN113254824B - Content determination method, device, medium, and program product

Info

Publication number: CN113254824B
Application number: CN202110533745.4A
Authority: CN
Inventors: 陈旭
Original assignee: Beijing Baidu Netcom Science and Technology Co Ltd
Current assignee: Beijing Baidu Netcom Science and Technology Co Ltd
Priority date: 2021-05-14
Filing date: 2021-05-14
Publication date: 2024-04-19
Anticipated expiration: 2041-05-14
Also published as: CN113254824A

Abstract

The present disclosure relates to a content determination method, apparatus, medium, and program product, and relates to the field of artificial intelligence such as knowledge graph, deep learning, natural language processing, and the like. One embodiment of the method comprises the following steps: acquiring content demand information; matching the content demand information with attribute tags included in each content in a preset content library to obtain matched candidate content sets, wherein the candidate content sets comprise quality coefficients corresponding to each content; and determining the contents meeting the preset quality coefficient in the candidate content set as target contents corresponding to the content demand information.

Description

Content determination method, device, medium, and program product

Technical Field

Embodiments of the present disclosure relate to the field of computers, in particular to artificial intelligence fields such as knowledge graphs, deep learning, and natural language processing, and more specifically to a content determination method, device, medium, and program product.

Background

Content is critical to how well an information stream converts: a good content creative not only makes people want to click, but also speaks to what the demander has in mind, thereby promoting low-cost conversion.

Content is currently obtained by the following methods:

(1) Methodology: high-quality content is sorted by sub-industry to write up a methodology and organize video courses. Basic techniques include composing content with numbers, simulated notifications, special symbols, and the like.

(2) Fill-in-the-blank: high-quality content templates are manually summarized for different industries, and content is obtained by filling in personalized information.

(3) Industry cases: the content creatives with the highest click rate in each sub-industry are selected and, with the authorization of the demander, displayed for reference.

Disclosure of Invention

The embodiment of the disclosure provides a content determination method, device, medium and program product.

In a first aspect, an embodiment of the present disclosure proposes a content determining method, including: acquiring content demand information; matching the content demand information with attribute tags included in each content in a preset content library to obtain matched candidate content sets, wherein the candidate content sets comprise quality coefficients corresponding to each content; and determining the contents meeting the preset quality coefficient in the candidate content set as target contents corresponding to the content demand information.

In a second aspect, an embodiment of the present disclosure proposes a content determining apparatus including: an information acquisition unit configured to acquire content demand information; the content matching unit is configured to match the content demand information with attribute tags included in each content in a preset content library to obtain matched candidate content sets, wherein the candidate content sets comprise quality coefficients corresponding to each content; and a content determining unit configured to determine, as target content corresponding to the content demand information, content satisfying a preset quality coefficient in the candidate content set.

In a third aspect, an embodiment of the present disclosure proposes an electronic device, including: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method as described in the first aspect.

In a fourth aspect, embodiments of the present disclosure provide a non-transitory computer-readable storage medium storing computer instructions for causing a computer to perform a method as described in the first aspect.

In a fifth aspect, embodiments of the present disclosure propose a computer program product comprising a computer program which, when executed by a processor, implements a method as described in the first aspect.

It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the disclosure, nor is it intended to be used to limit the scope of the disclosure. Other features of the present disclosure will become apparent from the following specification.

Drawings

Other features, objects and advantages of the present disclosure will become more apparent upon reading of the detailed description of non-limiting embodiments made with reference to the following drawings. The drawings are for a better understanding of the present solution and are not to be construed as limiting the present disclosure. Wherein:

FIG. 1 is an exemplary system architecture diagram to which the present disclosure may be applied;

FIG. 2 is a flow chart of one embodiment of a method of determining content according to the present disclosure;

FIG. 3 is a schematic diagram of generating a library of preset content according to the present disclosure;

FIG. 4 is a flow chart of one embodiment of a method of determining content according to the present disclosure;

FIG. 5 is an application scenario diagram of a content determination method according to the present disclosure;

FIG. 6 is a schematic diagram of the structure of one embodiment of a content determining apparatus according to the present disclosure;

FIG. 7 is a block diagram of an electronic device used to implement an embodiment of the present disclosure.

Detailed Description

Exemplary embodiments of the present disclosure are described below in conjunction with the accompanying drawings, which include various details of the embodiments of the present disclosure to facilitate understanding, and should be considered as merely exemplary. Accordingly, one of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.

It should be noted that, without conflict, the embodiments of the present disclosure and features of the embodiments may be combined with each other. The present disclosure will be described in detail below with reference to the accompanying drawings in conjunction with embodiments.

Fig. 1 illustrates an exemplary system architecture 100 in which embodiments of the content determining method or content determining apparatus of the present disclosure may be applied.

As shown in fig. 1, a system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 is used as a medium to provide communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, among others.

A user may interact with the server 105 via the network 104 using the terminal devices 101, 102, 103 to receive or transmit video frames or the like. Various client applications, intelligent interactive applications, such as software for processing contents, software for processing information, etc., may be installed on the terminal devices 101, 102, 103.

The terminal devices 101, 102, 103 may be hardware or software. When the terminal devices 101, 102, 103 are hardware, they may be electronic products that interact with a user through one or more of a keyboard, a touch pad, a touch screen, a remote controller, voice interaction, or a handwriting device, such as a PC (personal computer), a mobile phone, a smart phone, a PDA (personal digital assistant), a wearable device, a PPC (pocket PC), a tablet computer, a smart car machine, a smart television, a smart speaker, a laptop portable computer, a desktop computer, and so on. When the terminal devices 101, 102, 103 are software, they can be installed in the above-described electronic devices, and may be implemented as a plurality of software or software modules, or as a single software or software module. The present invention is not particularly limited herein.

The server 105 may provide various services. For example, the server 105 may analyze and process videos displayed on the terminal devices 101, 102, 103 and generate processing results.

The server 105 may be hardware or software. When the server 105 is hardware, it may be implemented as a distributed server cluster formed by a plurality of servers, or as a single server. When server 105 is software, it may be implemented as a plurality of software or software modules (e.g., to provide distributed services), or as a single software or software module. The present invention is not particularly limited herein.

It should be noted that, the content determining method provided in the embodiments of the present disclosure is generally performed by the server 105, and accordingly, the content determining apparatus is generally disposed in the server 105.

It should be understood that the number of terminal devices, networks and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.

With continued reference to fig. 2, a flow 200 of one embodiment of a method of determining content according to the present disclosure is shown. The content determination method may include the steps of:

Step 201, acquiring content demand information.

In the present embodiment, the execution subject of the content determination method (e.g., the terminal devices 101, 102, 103 or the server 105 shown in fig. 1) may acquire content demand information. The content demand information may refer to the requirements that a demander places on the content of an information stream, for example the content's smoothness, originality, estimated click rate, and the like. The demander may be a user, a client, an enterprise, etc. Creative strategies that have already been formulated can be expressed in text form.

In this embodiment, the content determining method further includes: adding the content at the position of the content push sub-module corresponding to the content tag carried by the content, and generating the push content to be pushed to the user. When acquiring content, the execution subject may set a content tag for it; for example, it may set a large-title tag for content input by the user and then add the content carrying the large-title tag to the large-title sub-module of the content push module. The content push sub-modules may include a large title sub-module, a subtitle sub-module, an active content sub-module, and the like.

In this embodiment, the display position of the content may also be set by a physical attribute tag included in the service requirement information.

In one example, the physical attribute tags include one or more of content element location, content element area ratio, content element primary color, number of words of the content, and product tags.

The content element position is the position of the content element in the content creative; the corresponding tag is calculated automatically by the machine from the element's coordinate points on the whole creative image. The content element area ratio is obtained by calculating the area of the content element and dividing it by the area of the whole creative image. The main color of the content element is identified by image recognition, and the corresponding tag is configured accordingly.

It should be noted that the respective content elements in the content creative may be labeled automatically by techniques such as image recognition and machine calculation. The physical attribute tags represent the natural, objective attributes of the content elements. Marking different tags provides more dimensions to refer to in later processing when judging the quality of a content creative: by comparing different positions, area ratios, and colors of different content elements, the degree to which users accept the content creative can be judged. In subsequent data processing, each judgment dimension is established so that the best scheme can be selected.

In this embodiment, the content demand information may also include a humanistic attribute tag, a business attribute tag, and an audience emotion tag. Humanistic attribute tags represent characteristics of preference, culture, and ideas, such as the style of the creative design (beautiful, refreshing, cool, technological, and the like) and the creative tonality (passion, happiness, melancholy, and the like), and label different content elements from a human perspective. Business attribute tags represent features such as brand and merchandise attributes, for example brand character (sincere, trendy, honest, etc.). Audience emotion tags represent the emotional reactions of the audience on seeing the creative and its elements, for example like, dislike, happy, confused, etc. These tags are added manually, which improves the flexibility of labeling. By providing a corresponding manual labeling interface, different content elements can be labeled independently, making the operation more flexible and convenient.
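The area-ratio and position calculations described above can be sketched as follows. This is a minimal illustration only: the `Box` type, the pixel coordinates, and the 3x3 position grid are assumptions introduced here, not details given in the disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box in pixels (hypothetical representation)."""
    left: int
    top: int
    width: int
    height: int

    @property
    def area(self) -> int:
        return self.width * self.height


def area_ratio(element: Box, creative: Box) -> float:
    """Element area divided by the area of the whole creative image."""
    if creative.area <= 0:
        raise ValueError("creative must have a positive area")
    return element.area / creative.area


def position_label(element: Box, creative: Box) -> str:
    """Coarse position tag from the element's centre point (assumed 3x3 grid)."""
    cx = element.left + element.width / 2
    cy = element.top + element.height / 2
    col = min(int(3 * (cx - creative.left) / creative.width), 2)
    row = min(int(3 * (cy - creative.top) / creative.height), 2)
    rows = ("top", "middle", "bottom")
    cols = ("left", "center", "right")
    return f"{rows[row]}-{cols[col]}"


creative = Box(0, 0, 1200, 600)
logo = Box(900, 0, 300, 150)
print(area_ratio(logo, creative))      # 0.0625
print(position_label(logo, creative))  # top-right
```

The sketch assumes the element lies inside the creative; main-color identification is omitted because it depends on an image recognition model that the text does not specify.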

Step 202, matching the content demand information with an attribute tag included in each content in a preset content library to obtain a matched candidate content set.

In this embodiment, the execution subject may match the content demand information against the attribute tags included in each content in a preset content library to obtain the attribute tags that match the content demand information, and then obtain a candidate content set based on the texts corresponding to all the matched attribute tags. The preset content library may be a pre-established content library from which the target content corresponding to the content demand information is determined. Each content in the preset content library may include at least one attribute tag; an attribute tag may be a category tag used in classifying the content and may include at least one of the following: attribute tags corresponding to element information, attribute tags corresponding to service points, attribute tags corresponding to entity information, and attribute tags corresponding to target fragments.

In addition, the attribute tag may also be an attribute tag of content the user has operated on, for example a tag corresponding to the attribute information of content the user clicked.
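A minimal sketch of the matching in step 202 is given below, assuming that the content demand information has already been reduced to a set of terms and that a content matches when it shares at least one attribute tag with those terms. The library contents and the exact-match rule are illustrative assumptions; the disclosure does not fix a particular matching rule.

```python
from typing import Dict, List, Set

# Hypothetical preset content library: content id -> attribute tags.
CONTENT_LIBRARY: Dict[str, Set[str]] = {
    "c1": {"brand B", "cheap", "limited stock"},
    "c2": {"brand A", "free shipping"},
    "c3": {"brand B", "new arrival"},
}


def match_candidates(demand_terms: Set[str], library: Dict[str, Set[str]]) -> List[str]:
    """Return ids of contents sharing at least one attribute tag with the demand."""
    return sorted(cid for cid, tags in library.items() if tags & demand_terms)


print(match_candidates({"brand B", "cheap"}, CONTENT_LIBRARY))  # ['c1', 'c3']
```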

Correspondingly, in this example, the preset content library may be determined based on the following steps:

First, acquire the original content; second, clean the original content; third, determine the attribute tags of each content in the cleaned content; fourth, determine the quality coefficient of each content; fifth, output the content in structured form.

In one example, in FIG. 3:

(1) Raw content is obtained from a content pool, e.g., information stream content, search content, crawled content, and intelligently generated content; the intelligently generated content may be generated by a machine-translation-trained content generation model.

(2) The original content is cleaned through quality screening, similarity filtering, and the like to obtain cleaned content.

(3) The attribute tags of the content can be set through the entity information identification model, service point matching, the element information identification model, and the target fragment model.

(4) The quality coefficient of the content can be set through a content smoothness model, a content click rate estimation model, an originality model, and the like.

(5) The content is output in structured form. For example, for the content "Brand A and B reduced! Hand slow no", the entity information is "brand B", the element information is "cheap", the target fragment is "hand slow no!", and the estimated click-through rate (CTR) is 0.05.

Step 203, determining the contents meeting the preset quality coefficient in the candidate content set as target contents corresponding to the content demand information.
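The structured output of item (5) might be represented as a simple record, as in the sketch below. The `ContentRecord` class and its field names are hypothetical; only the example values come from the text.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class ContentRecord:
    """One structured entry of the preset content library (field names are hypothetical)."""
    text: str
    entity_info: str
    element_info: str
    target_fragment: str
    estimated_ctr: float


record = ContentRecord(
    text="Brand A and B reduced! Hand slow no",
    entity_info="brand B",
    element_info="cheap",
    target_fragment="hand slow no!",
    estimated_ctr=0.05,
)

# Serialize the record so it can be stored in, and reloaded from, the library.
serialized = json.dumps(asdict(record), ensure_ascii=False)
restored = ContentRecord(**json.loads(serialized))
print(restored == record)  # True
```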

In this embodiment, the execution body may determine, as the target content corresponding to the content demand information, the content satisfying the preset quality coefficient in the candidate content set. The above quality coefficient may be used to evaluate the quality of each of the candidate contents.

The content determining method provided by the embodiment of the disclosure first acquires content demand information; then matches the content demand information with the attribute tags included in each content in a preset content library to obtain a matched candidate content set, wherein the candidate content set comprises a quality coefficient corresponding to each content; and finally determines the contents in the candidate content set that satisfy the preset quality coefficient as the target contents corresponding to the content demand information. When the preset content library is searched using the content demand information, the content is screened by the attribute tags included in each content and sorted according to the quality coefficients, which achieves personalized recommendation and improves content screening efficiency.

In some optional implementations of the present embodiment, after the content demand information is acquired, the content determining method further includes: obtaining information related to content elements from a preset knowledge graph according to the content demand information. Matching the content demand information with the attribute tags included in each content in the preset content library to obtain a matched candidate content set then includes: matching the information related to the content elements with the attribute tags included in each content to obtain the candidate content set.

In this implementation, service points and entity information are used in advance as the entities of the knowledge graph, and the relationships between service points and entity information are used to construct the knowledge graph. When the knowledge graph is used to interpret the content demand information, the knowledge corresponding to the service points and to the entity information can each be obtained.
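Step 203 can be sketched as a threshold filter followed by a sort on the quality coefficient. The `Candidate` type, the single scalar quality value, and the threshold of 0.5 are assumptions made for illustration.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    content_id: str
    quality: float  # quality coefficient attached to the candidate


def select_targets(candidates: List[Candidate], min_quality: float) -> List[Candidate]:
    """Keep candidates meeting the preset quality coefficient, best first."""
    kept = [c for c in candidates if c.quality >= min_quality]
    return sorted(kept, key=lambda c: c.quality, reverse=True)


candidates = [Candidate("c1", 0.82), Candidate("c3", 0.40), Candidate("c5", 0.91)]
targets = select_targets(candidates, min_quality=0.5)
print([c.content_id for c in targets])  # ['c5', 'c1']
```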
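As a rough sketch of how such a knowledge graph might be queried, the example below stores service points and entities as nodes and returns the neighbors of every node mentioned in the demand text. The triples, relation names, and substring lookup are all hypothetical; the disclosure does not describe the graph's storage or query mechanism.

```python
from collections import defaultdict
from typing import DefaultDict, List, Set, Tuple

Triple = Tuple[str, str, str]  # (head, relation, tail)

# Hypothetical triples linking entity information and service points.
TRIPLES: List[Triple] = [
    ("brand B", "offers", "discount sale"),
    ("brand B", "competes_with", "brand A"),
    ("discount sale", "lands_on", "sale landing page"),
]


def build_graph(triples: List[Triple]) -> DefaultDict[str, Set[str]]:
    """Build an undirected adjacency map from the triples."""
    graph: DefaultDict[str, Set[str]] = defaultdict(set)
    for head, _relation, tail in triples:
        graph[head].add(tail)
        graph[tail].add(head)
    return graph


def related_info(demand_text: str, graph: DefaultDict[str, Set[str]]) -> Set[str]:
    """Collect every node mentioned in the demand text plus its direct neighbors."""
    text = demand_text.lower()
    found: Set[str] = set()
    for node in list(graph):
        if node.lower() in text:
            found.add(node)
            found |= graph[node]
    return found


graph = build_graph(TRIPLES)
print(sorted(related_info("Need a headline for brand B", graph)))
# ['brand A', 'brand B', 'discount sale']
```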

In this implementation manner, the content requirement information may be respectively matched with an attribute tag included in each content in a preset content library, so as to obtain a candidate content set.

In some optional implementations of the present embodiment, the information related to the content element includes at least one of: intent word, landing page, service point, entity information.

In this implementation, the intent word may be a word in the content demand information that characterizes the demand party's intent.

In this implementation, the entity information may be information obtained by performing entity word segmentation on the content. The entity information may include one or more pieces of content-element-related information that describe the content, such as "brand information" and "competitors" (e.g., brands that compete with the current brand).

In this implementation manner, the information related to the content element may be respectively matched with an attribute tag included in each content in the preset content library, so as to obtain a candidate content set.

In some optional implementations of the present embodiment, the attribute tags include at least one of: attribute tags corresponding to element information, attribute tags corresponding to service points, attribute tags corresponding to entity information, and attribute tags corresponding to target fragments. The element information may be the information required to describe each content in the candidate content set; for example, it may be obtained by extraction from the content. The element information may include an element field and the corresponding element content. The element content may describe at least one of the operation subject, operation object, operation type, and operation time of the operation corresponding to the content; the element fields may accordingly include an operation subject field, an operation object field, an operation type field, and a time field. For example, the element field may be "selling point" and the element content "inexpensive".

In this implementation manner, the attribute tag included in each content in the preset content library may be determined based on the following model:

Entity information identification model: ERNIE (Enhanced Language Representation with Informative Entities) combined with a conditional random field (CRF) is adopted. In addition to the demander's own extraction, the training samples can be updated through active learning; precision and recall can exceed 95% when the number of labeled samples reaches 5 million.

Element information classification model: element information systems for different industries are constructed through a combination of machine mining and manual induction, and a multi-class classification model is trained based on ERNIE. For example, with more than 100 tags, accuracy can reach 85% and coverage of the whole library 46%.

Target segment mining: segment identification is driven by an attention mechanism, with boundaries and lengths corrected; the segments are shown explicitly to the demander by highlighting them in blue, which encourages adoption by the demander. The target segment may be the highest-weighted segment in the content.

Multidimensional quality score system: includes a content smoothness model, a content click rate estimation model, an originality model, and the like.

In this implementation, classification of each content in the preset content library may be implemented from multiple dimensions.

In some alternative implementations of the present embodiment, the quality coefficient includes at least one of: a quality coefficient corresponding to the click rate, a quality coefficient corresponding to the smoothness, and a quality coefficient corresponding to the originality; wherein the click rate is generated by users operating on each content.

In this implementation, the quality coefficient corresponding to the click rate may be: the quality of the content is measured based on the click rate of the content clicked by the user. For example, a higher click rate for a content indicates a higher popularity of the content with the user and also indicates a higher quality factor for the content.

The quality coefficient corresponding to the smoothness may be obtained by inputting the content in the preset content library into a pre-trained smoothness network model to obtain the smoothness of each content. The smoothness network model may be used to determine a smoothness score for the content.

In one example, language model perplexity (PPL) is an indicator used to measure the performance of a language model. In this implementation, perplexity can be used to quantify the smoothness of content: the lower the perplexity of the content, the more smooth and natural its semantics and the higher its smoothness; conversely, the content is likely to have poor semantics or wrongly written words. In actual implementation, the perplexity can be calculated with a preset PPL formula based on a Chinese N-gram language model, and the calculated perplexities of the content are then weighted and summed to obtain the smoothness of the content.

The higher the smoothness of the content (i.e., the lower the perplexity), the higher the corresponding quality coefficient.
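To make the relationship between perplexity and smoothness concrete, the sketch below computes perplexity with a toy bigram model using add-one smoothing, where PPL = exp(-(1/N) Σ log p(w_i | w_{i-1})). The corpus, the smoothing scheme, and the whitespace tokens are assumptions for illustration; the disclosure only states that a preset PPL formula and an N-gram language model are used.

```python
import math
from collections import Counter
from typing import List, Tuple


def train_bigram(corpus: List[List[str]]) -> Tuple[Counter, Counter]:
    """Count unigrams and bigrams, with a start-of-sentence marker."""
    unigrams: Counter = Counter()
    bigrams: Counter = Counter()
    for sentence in corpus:
        tokens = ["<s>"] + sentence
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def perplexity(sentence: List[str], unigrams: Counter, bigrams: Counter) -> float:
    """PPL = exp(-(1/N) * sum(log p(w_i | w_{i-1}))) with add-one smoothing."""
    if not sentence:
        raise ValueError("sentence must not be empty")
    tokens = ["<s>"] + sentence
    vocab_size = len(unigrams)
    log_prob = 0.0
    for prev, word in zip(tokens, tokens[1:]):
        p = (bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size)
        log_prob += math.log(p)
    return math.exp(-log_prob / len(sentence))


corpus = [
    ["big", "sale", "today"],
    ["big", "discount", "today"],
    ["sale", "ends", "today"],
]
unigrams, bigrams = train_bigram(corpus)
fluent = perplexity(["big", "sale", "today"], unigrams, bigrams)
scrambled = perplexity(["today", "sale", "big"], unigrams, bigrams)
print(fluent < scrambled)  # True: the natural word order has lower perplexity
```

A smoothness score could then be derived as any decreasing function of the perplexity; which function to use is left open by the text.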

The quality coefficient corresponding to the originality may measure the quality of the content based on how original the user's content is. For example, the higher the originality of the content, the higher its corresponding quality coefficient.

When determining the quality coefficient of each content, weight coefficients may be set respectively for the quality coefficient corresponding to the click rate, the quality coefficient corresponding to the smoothness, and the quality coefficient corresponding to the originality, so as to obtain the most suitable quality coefficient for each content. The weight coefficients may be based on user preferences; for example, if the user tends to favor original content, the weight coefficient for the originality quality coefficient may be set higher than those for the click rate and smoothness quality coefficients.
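One way to combine the per-dimension coefficients is a normalized weighted sum, sketched below. The weighted-sum form, the dictionary keys, and the example values are assumptions; the text only says that a weight coefficient may be set for each dimension.

```python
from typing import Dict


def combined_quality(scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted average of per-dimension quality coefficients."""
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(weights[name] * scores[name] for name in weights) / total_weight


scores = {"ctr": 0.4, "smoothness": 0.8, "originality": 0.9}
# A user who favors original content gets a larger originality weight.
weights = {"ctr": 0.25, "smoothness": 0.25, "originality": 0.5}
print(round(combined_quality(scores, weights), 2))  # 0.75
```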

In this implementation, the quality coefficient may further include screening the quality of the content based on a degree of repetition, which represents the degree of repetition between different content descriptions within each content.

In one example, content in the preset content library may be filtered based on the degree of repetition, which includes literal repetition and semantic repetition. Literal repetition: rules are formulated to determine whether different content descriptions in each content repeat one another, such as repetition of adjacent words, repetition of clauses, repeated description with attribute words, and the like. Semantic repetition: word2vec word vectors are trained, and if similar words or similar clauses are found, the content is judged to have a repetition problem.

In this implementation, the contents in the candidate content set are screened through the quality coefficients of click rate, smoothness, and originality, so that problematic contents can be filtered out and contents with high confidence and high coverage retained as the final output, ensuring the quality of the content.
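The two kinds of repetition check can be sketched as follows. The literal rules shown (adjacent-word and clause repetition) follow the text; the semantic check uses cosine similarity over hand-written toy vectors that merely stand in for trained word2vec vectors, and the 0.9 threshold is an assumption.

```python
import math
import re
from typing import Dict, List, Sequence


def has_adjacent_word_repeat(text: str) -> bool:
    """Literal rule: the same word appears twice in a row."""
    words = text.lower().split()
    return any(a == b for a, b in zip(words, words[1:]))


def has_clause_repeat(text: str) -> bool:
    """Literal rule: the same clause appears more than once."""
    clauses = [c.strip().lower() for c in re.split(r"[,.!?;]", text) if c.strip()]
    return len(clauses) != len(set(clauses))


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def has_semantic_repeat(words: List[str], vectors: Dict[str, Sequence[float]],
                        threshold: float = 0.9) -> bool:
    """Semantic rule: two distinct words have near-identical vectors."""
    known = [w for w in dict.fromkeys(words) if w in vectors]
    return any(
        cosine(vectors[a], vectors[b]) >= threshold
        for i, a in enumerate(known)
        for b in known[i + 1:]
    )


# Toy 2-d vectors standing in for trained word2vec embeddings.
VECTORS = {"cheap": [0.9, 0.1], "inexpensive": [0.88, 0.15], "fast": [0.1, 0.95]}

print(has_adjacent_word_repeat("big big sale"))                   # True
print(has_clause_repeat("Low price, fast delivery, low price!"))  # True
print(has_semantic_repeat(["cheap", "inexpensive"], VECTORS))     # True
print(has_semantic_repeat(["cheap", "fast"], VECTORS))            # False
```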

In some optional implementations of this embodiment, the similarity between the attribute tags of each content in the candidate content set and the content demand information is greater than a preset similarity threshold.

In this implementation, each content in the preset content library is matched with the content demand information to obtain matched candidate content; the candidate content set is then formed from those candidate contents whose attribute tags have a similarity to the content demand information greater than the preset similarity threshold.

It should be noted that the similarity threshold may be set by the user or selected according to the content determination accuracy.

In this implementation, the similarity threshold may be used to further screen the content in the preset content library, so as to determine the content from a small range.
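The similarity-threshold screening might look like the sketch below, which uses Jaccard similarity between the demand terms and each content's attribute tags. Jaccard similarity is a stand-in chosen for illustration; the disclosure does not name a similarity measure, and the library and threshold are hypothetical.

```python
from typing import Dict, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Size of the intersection divided by size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def filter_by_similarity(demand: Set[str], library: Dict[str, Set[str]],
                         threshold: float) -> Dict[str, float]:
    """Keep contents whose tag similarity to the demand exceeds the threshold."""
    kept: Dict[str, float] = {}
    for content_id, tags in library.items():
        similarity = jaccard(demand, tags)
        if similarity > threshold:
            kept[content_id] = similarity
    return kept


library = {
    "c1": {"brand B", "cheap", "limited stock"},
    "c3": {"brand B", "new arrival"},
}
kept = filter_by_similarity({"brand B", "cheap"}, library, threshold=0.5)
print(sorted(kept))  # ['c1']
```

Raising the threshold narrows the candidate set, which matches the text's description of determining content from a smaller range.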

In some optional implementations of this embodiment, determining the content in the candidate content set that satisfies the preset quality coefficient as the target content corresponding to the content demand information includes: determining, according to a preset user adoption rate corresponding to the content, the content in the candidate content set that satisfies the preset quality coefficient as the target content corresponding to the content demand information.

Specifically, the target content is screened from the contents satisfying the preset quality coefficient according to their user adoption rates.

It should be noted that the preset user adoption rate may be determined by a pre-trained adoption rate model: before the contents in the preset content library are recommended to the demander, each content may be input into the adoption rate model to obtain its adoption rate. The adoption rate may also be calculated from whether the content was adopted each time it was recommended. For example, if a content is recommended 10 times and adopted five times by different demanders, its adoption rate is 50%; as the number of recommendations increases, the adoption rate changes as well. A change in the adoption rate also reflects a change in the demanders' content needs. If the adoption rate is low (for example, below a certain threshold), the recommendation strategy needs to be adjusted, for example by raising the similarity threshold, so as to minimize the recommendation of low-adoption content to the demander.
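The count-based adoption rate from the example above, together with one possible strategy adjustment, can be sketched as follows. The `AdoptionTracker` class, the `min_rate` of 0.2, and the `step` of 0.05 are hypothetical; the pre-trained adoption rate model mentioned in the text is not modeled here.

```python
from collections import defaultdict
from typing import DefaultDict, List


class AdoptionTracker:
    """Tracks, per content, whether each recommendation was adopted."""

    def __init__(self) -> None:
        self._outcomes: DefaultDict[str, List[bool]] = defaultdict(list)

    def record(self, content_id: str, adopted: bool) -> None:
        self._outcomes[content_id].append(adopted)

    def rate(self, content_id: str) -> float:
        """Adoptions divided by recommendations; 0.0 if never recommended."""
        outcomes = self._outcomes.get(content_id, [])
        return sum(outcomes) / len(outcomes) if outcomes else 0.0


def adjust_threshold(threshold: float, adoption_rate: float,
                     min_rate: float = 0.2, step: float = 0.05) -> float:
    """Raise the similarity threshold when the adoption rate is too low."""
    if adoption_rate < min_rate:
        return min(1.0, threshold + step)
    return threshold


tracker = AdoptionTracker()
for adopted in [True, False] * 5:  # 10 recommendations, 5 adoptions
    tracker.record("c1", adopted)
print(tracker.rate("c1"))                         # 0.5
print(adjust_threshold(0.5, tracker.rate("c1")))  # 0.5 (rate is above min_rate)
```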

In this implementation manner, the content may be further filtered according to a user adoption rate of each content in the preset content library.

In some optional implementations of the present embodiment, the content demand information may further include requirements on wildcards, number of characters, and the like; the content determination method further includes: screening the content in the preset content library by element information, wildcards, number of characters, and the like to obtain target content matching the content demand information.

In some optional implementations of the present embodiment, after the target content is determined, a recommendation reason for the target content may also be displayed, for example that its CTR is high.

With further reference to fig. 4, fig. 4 illustrates a flow 400 of one embodiment of a method of determining content according to the present disclosure. The content determination method may include the steps of:

In step 401, content demand information is obtained.

And step 402, obtaining information related to the content elements from a preset knowledge graph according to the content demand information.

In this embodiment, the execution subject of the content determining method (for example, the terminal devices 101, 102, 103 or the server 105 shown in fig. 1) may obtain information related to the content element from a preset knowledge graph according to the content demand information.

And step 403, matching the content demand information with attribute tags included in each content in a preset content library to obtain a matched candidate content set.

And step 404, determining the contents meeting the preset quality coefficient in the candidate content set as target contents corresponding to the content demand information.

In this embodiment, the specific operations of steps 401 and 404 are described in detail in steps 201 and 203, respectively, of the embodiment shown in fig. 2, and are not repeated here.

As can be seen from fig. 4, the content determining method in the present embodiment highlights the step of acquiring information related to content elements in the content demand information based on the knowledge-graph, compared with the embodiment corresponding to fig. 2. Thus, the solution described in this embodiment obtains information related to the content element corresponding to the content demand information based on the knowledge graph; and then, matching the information related to the content elements with attribute tags included in each content in a preset content library to obtain a candidate content set.
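The flow of fig. 4 can be sketched as follows. This is only an illustration under stated assumptions: the knowledge graph is modelled as a plain dictionary, each content carries a single aggregated quality coefficient, and matching means that the attribute tags share at least one entry with the retrieved element information. All names and data (`KNOWLEDGE_GRAPH`, `expand_demand`, `match_candidates`, `select_targets`, `determine_content`) are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Content:
    text: str
    tags: frozenset  # attribute tags of the content
    quality: float   # assumption: one aggregated quality coefficient


# Toy knowledge graph (hypothetical data): demand term -> related elements.
KNOWLEDGE_GRAPH = {
    "english training": {"english", "education", "online course"},
}


def expand_demand(demand, graph):
    """Step 402: look up information related to the content elements."""
    return set(graph.get(demand, ())) | {demand}


def match_candidates(elements, library):
    """Step 403: keep contents whose attribute tags overlap the elements."""
    return [c for c in library if elements & c.tags]


def select_targets(candidates, min_quality):
    """Step 404: keep candidates that meet the preset quality coefficient."""
    return [c for c in candidates if c.quality >= min_quality]


def determine_content(demand, graph, library, min_quality=0.5):
    """Steps 401 to 404 chained together for one demand string."""
    elements = expand_demand(demand, graph)
    return select_targets(match_candidates(elements, library), min_quality)
```

A demand that is absent from the graph falls back to matching on the demand string itself, so an unknown demand simply yields fewer (possibly zero) candidates.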

With further reference to fig. 5, fig. 5 illustrates one application scenario of a content determination method according to the present disclosure. In the application scenario, the content determination method may include the steps of:

First, obtaining original content and cleaning the original content.

Second, determining attribute tags and quality coefficients of the cleaned content.

In this embodiment, the attribute tags may include element information, service points, entity information, target segments, and the like. The quality coefficients may include estimated click-through rate (CTR), smoothness, and nativeness.

Third, acquiring content demand information.

Fourth, obtaining information related to the content elements from a preset knowledge graph according to the content demand information.

In this embodiment, the information related to the content elements may include: intent words, content, landing pages, service points, entity information, etc.

Fifth, generating a candidate content set.

Sixth, performing relevance control.

In this embodiment, the relevance control may screen the content by setting a similarity threshold for the content in the preset content library.

Seventh, ranking.

In this embodiment, the contents filtered based on the similarity threshold may be ranked to determine the target content corresponding to the content demand information. Alternatively, the ranked content is further filtered based on the user adoption rate of the content to obtain the target content corresponding to the content demand information.
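The last three steps of this scenario (relevance control, ranking, and the optional adoption-rate filter) might look like the sketch below. The disclosure does not name a similarity measure or a ranking key, so the sketch assumes Jaccard similarity between the demand-related elements and each content's attribute tags, and ranks by estimated CTR; `Item`, `jaccard`, `relevance_control`, `rank`, and `filter_by_adoption` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Item:
    text: str
    tags: frozenset       # attribute tags
    est_ctr: float        # estimated click-through rate
    adoption_rate: float  # share of users who adopted this content


def jaccard(a, b):
    """Jaccard similarity of two sets (assumed measure, not from the source)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def relevance_control(elements, items, threshold):
    """Sixth step: keep items whose tag similarity exceeds the threshold."""
    return [i for i in items if jaccard(elements, i.tags) > threshold]


def rank(items):
    """Seventh step: order by estimated CTR, highest first (assumed key)."""
    return sorted(items, key=lambda i: i.est_ctr, reverse=True)


def filter_by_adoption(items, min_rate):
    """Optional extra filter on the user adoption rate; keeps the order."""
    return [i for i in items if i.adoption_rate >= min_rate]
```

Because `filter_by_adoption` preserves order, it can be applied after `rank` without re-sorting.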

With further reference to fig. 6, as an implementation of the method shown in the foregoing figures, the present disclosure provides an embodiment of a content determining apparatus, which corresponds to the method embodiment shown in fig. 2, and which is particularly applicable to various electronic devices.

As shown in fig. 6, the content determining apparatus 600 of the present embodiment may include: an information acquisition unit 601, a content matching unit 602, and a content determination unit 603. Wherein, the information acquisition unit is configured to acquire the content demand information; the content matching unit is configured to match the content demand information with attribute tags included in each content in a preset content library to obtain matched candidate content sets, wherein the candidate content sets comprise quality coefficients corresponding to each content; and a content determining unit configured to determine, as target content corresponding to the content demand information, content satisfying a preset quality coefficient in the candidate content set.

In the present embodiment, in the content determining apparatus 600: the specific processes of the information obtaining unit 601, the content matching unit 602, and the content determining unit 603 and the technical effects thereof may refer to the relevant descriptions of steps 201 to 203 in the corresponding embodiment of fig. 2, and are not described herein.
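As a rough illustration of how the three units of apparatus 600 could be composed in software, consider the sketch below. The class and method names are hypothetical, the demand is assumed to be a single keyword, and the library is assumed to be a list of `(text, attribute_tags, quality_coefficient)` tuples; none of this is prescribed by the disclosure.

```python
class InformationAcquisitionUnit:
    """Unit 601: acquires content demand information."""

    def acquire(self, raw_request: str) -> str:
        # Assumption: the demand is one keyword, normalised to lower case.
        return raw_request.strip().lower()


class ContentMatchingUnit:
    """Unit 602: matches the demand against attribute tags."""

    def __init__(self, library):
        # library: list of (text, attribute_tags, quality_coefficient) tuples
        self.library = library

    def match(self, demand: str):
        return [entry for entry in self.library if demand in entry[1]]


class ContentDeterminingUnit:
    """Unit 603: keeps candidates that meet the preset quality coefficient."""

    def __init__(self, min_quality: float):
        self.min_quality = min_quality

    def determine(self, candidates):
        return [text for text, _tags, q in candidates if q >= self.min_quality]


class ContentDeterminingApparatus:
    """Apparatus 600: wires units 601, 602 and 603 together."""

    def __init__(self, library, min_quality: float):
        self.acquisition = InformationAcquisitionUnit()
        self.matching = ContentMatchingUnit(library)
        self.determining = ContentDeterminingUnit(min_quality)

    def run(self, raw_request: str):
        demand = self.acquisition.acquire(raw_request)
        return self.determining.determine(self.matching.match(demand))
```

Keeping each unit behind a single method makes it straightforward to swap in, for example, a knowledge-graph-backed matching unit without touching the other two.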

In some optional implementations of the present embodiment, the content determining apparatus further includes: an information obtaining unit configured to obtain, after the content demand information is acquired, information related to the content elements from a preset knowledge graph according to the content demand information; the content matching unit 602 is further configured to: match the information related to the content elements with the attribute tags included in each content to obtain the candidate content set.

In some optional implementations of the present embodiment, the attribute tags include at least one of: attribute tags corresponding to element information, attribute tags corresponding to service points, attribute tags corresponding to entity information and attribute tags corresponding to target fragments.

In some optional implementations of this embodiment, the similarity between the attribute tags of each content in the candidate content set is greater than a preset similarity threshold.

In some optional implementations of the present embodiment, the content determining unit 603 is further configured to: and according to the preset user acceptance corresponding to the content, determining the content meeting the preset quality coefficient in the candidate content set as the target content corresponding to the content demand information.

According to embodiments of the present disclosure, the present disclosure also provides an electronic device, a readable storage medium and a computer program product.

Fig. 7 illustrates a schematic block diagram of an example electronic device 700 that may be used to implement embodiments of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular telephones, smartphones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the disclosure described and/or claimed herein.

As shown in fig. 7, the device 700 includes a computing unit 701 that can perform various appropriate actions and processes according to a computer program stored in a Read Only Memory (ROM) 702 or a computer program loaded from a storage unit 708 into a Random Access Memory (RAM) 703. In the RAM 703, various programs and data required for the operation of the device 700 may also be stored. The computing unit 701, the ROM 702, and the RAM 703 are connected to each other through a bus 704. An input/output (I/O) interface 705 is also connected to bus 704.

Various components in device 700 are connected to I/O interface 705, including: an input unit 706 such as a keyboard, a mouse, etc.; an output unit 707 such as various types of displays, speakers, and the like; a storage unit 708 such as a magnetic disk, an optical disk, or the like; and a communication unit 709 such as a network card, modem, wireless communication transceiver, etc. The communication unit 709 allows the device 700 to exchange information/data with other devices via a computer network, such as the internet, and/or various telecommunication networks.

The computing unit 701 may be a variety of general and/or special purpose processing components having processing and computing capabilities. Some examples of computing unit 701 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, etc. The computing unit 701 performs the respective methods and processes described above, for example, the content determination method. For example, in some embodiments, the content determination method may be implemented as a computer software program tangibly embodied on a machine-readable medium, such as storage unit 708. In some embodiments, part or all of the computer program may be loaded and/or installed onto device 700 via ROM 702 and/or communication unit 709. When a computer program is loaded into the RAM 703 and executed by the computing unit 701, one or more steps of the content determination method described above may be performed. Alternatively, in other embodiments, the computing unit 701 may be configured to perform the content determination method by any other suitable means (e.g. by means of firmware).

Various implementations of the systems and techniques described above may be implemented in digital electronic circuitry, integrated circuit systems, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems On Chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be a special purpose or general-purpose programmable processor, that may receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.

Program code for carrying out methods of the present disclosure may be written in any combination of one or more programming languages. The program code may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowchart and/or block diagram to be implemented. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine, or entirely on the remote machine or server.

In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.

To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and pointing device (e.g., a mouse or trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.

The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: Local Area Networks (LANs), Wide Area Networks (WANs), and the internet.

The computer system may include a client and a server. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.

Artificial intelligence is the discipline that studies how to make computers simulate certain human thought processes and intelligent behaviors (e.g., learning, reasoning, thinking, planning), and it involves both hardware-level and software-level techniques. Artificial intelligence hardware technologies generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, and the like; artificial intelligence software technologies mainly include computer vision, speech recognition, natural language processing, machine learning/deep learning, big data processing, knowledge graph technology, and the like.

It should be appreciated that various forms of the flows shown above may be used to reorder, add, or delete steps. For example, the steps recited in the present disclosure may be performed in parallel or sequentially or in a different order, provided that the desired results of the technical solutions mentioned in the present disclosure are achieved, and are not limited herein.

The above detailed description should not be taken as limiting the scope of the present disclosure. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present disclosure are intended to be included within the scope of the present disclosure.

Claims

1. A content determination method, comprising:

constructing a content library, wherein each content in the content library corresponds to an attribute tag and a quality coefficient, and the quality coefficient comprises a quality coefficient corresponding to a degree of repetition, the degree of repetition representing the degree of repetition between each content and other content;

Acquiring content demand information;

Matching the content demand information with attribute tags included in each content in the content library to obtain a matched candidate content set, wherein the similarity between the attribute tags of each content in the candidate content set is larger than a preset similarity threshold;

Determining contents meeting a preset quality coefficient in the candidate content set as target contents corresponding to the content demand information, wherein the contents in the candidate content set are filtered based on the repetition degree, and the contents meeting the preset quality coefficient in the candidate content set are determined as target contents corresponding to the content demand information according to the preset user acceptance degree corresponding to the contents;

the construction of the content library comprises the following steps:

Acquiring original content based on at least one of information stream content, search content, capture content and intelligent generation content, and cleaning the acquired original content;

and determining the attribute label and the quality coefficient of each content in the content obtained after cleaning.

2. The method of claim 1, wherein after obtaining content demand information, the method further comprises:

Obtaining information related to the content elements from a preset knowledge graph according to the content demand information;

the matching the content requirement information with the attribute tags included in each content in the content library to obtain a matched candidate content set includes:

And matching the information related to the content elements with attribute tags included in each content to obtain the candidate content set.

3. The method of claim 2, wherein the information related to content elements comprises at least one of: intent word, landing page, service point, entity information.

4. A method according to any of claims 1-3, wherein the attribute tags comprise at least one of: attribute tags corresponding to element information, attribute tags corresponding to service points, attribute tags corresponding to entity information and attribute tags corresponding to target fragments.

5. A method according to any one of claims 1-3, wherein the quality coefficient further comprises at least one of: a quality coefficient corresponding to click rate, a quality coefficient corresponding to smoothness, and a quality coefficient corresponding to nativeness; wherein the click rate is generated by users operating on each content.

6. A content determination apparatus comprising:

a content library construction unit configured to construct a content library, each content in the content library corresponding to an attribute tag and a quality coefficient, wherein the quality coefficient includes a quality coefficient corresponding to a degree of repetition representing a degree of repetition between each content and other content;

An information acquisition unit configured to acquire content demand information;

A content matching unit configured to match the content requirement information with attribute tags included in each content in the content library to obtain a matched candidate content set, wherein a similarity between the attribute tags of each content in the candidate content set is greater than a preset similarity threshold;

A content determining unit configured to determine contents of the candidate content set satisfying a preset quality coefficient as target contents corresponding to the content demand information, wherein the content determining unit includes filtering the contents of the candidate content set based on a repetition degree, and determining the contents of the candidate content set satisfying the preset quality coefficient as target contents corresponding to the content demand information according to a preset user acceptance degree corresponding to the contents;

wherein, the constructing the content library comprises:

7. The apparatus of claim 6, wherein after obtaining content demand information, the apparatus further comprises:

The information obtaining unit is configured to obtain information related to the content elements from a preset knowledge graph according to the content demand information;

The content matching unit is further configured to: and matching the information related to the content elements with attribute tags included in each content to obtain the candidate content set.

8. The apparatus of claim 7, wherein the information related to content elements comprises at least one of: intent word, landing page, service point, entity information.

9. The apparatus of any of claims 6-8, wherein the attribute tag comprises at least one of: attribute tags corresponding to element information, attribute tags corresponding to service points, attribute tags corresponding to entity information and attribute tags corresponding to target fragments.

10. The apparatus of any of claims 6-8, wherein the quality coefficient further comprises at least one of: a quality coefficient corresponding to click rate, a quality coefficient corresponding to smoothness, and a quality coefficient corresponding to nativeness; wherein the click rate is generated by users operating on each content.

11. An electronic device, comprising:

At least one processor; and

A memory communicatively coupled to the at least one processor; wherein,

The memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-5.

12. A non-transitory computer readable storage medium storing computer instructions for causing a computer to perform the method of any one of claims 1-5.

13. A computer program product comprising a computer program which, when executed by a processor, implements the method according to any of claims 1-5.