CN113032671A - Content processing method, content processing device, electronic equipment and storage medium

Info

Publication number: CN113032671A
Application number: CN202110288139.0A
Authority: CN
Inventors: 吴广发; 薛璐影; 施鹏; 孙六英; 魏谨谦
Original assignee: Beijing Baidu Netcom Science and Technology Co Ltd
Current assignee: Beijing Baidu Netcom Science and Technology Co Ltd
Priority date: 2021-03-17
Filing date: 2021-03-17
Publication date: 2021-06-25
Anticipated expiration: 2041-03-17
Also published as: CN113032671B

Abstract

The disclosure provides a content processing method and apparatus, an electronic device, and a storage medium, and relates to the field of internet information processing. The specific implementation is as follows: determining M knowledge domains related to N contents based on tags of the N contents produced by a target producer; determining a domain concentration of the N contents based on the number of contents, among the N contents, corresponding to each of the M knowledge domains; and determining a recommendation level of the target producer based on the domain concentration of the N contents. The disclosed embodiments facilitate accurate identification of the production capacity of the target producer.

Description

Content processing method, content processing device, electronic equipment and storage medium

Technical Field

The present disclosure relates to the field of data processing technology, and in particular, to the field of internet information processing.

Background

With the rapid development of the internet, the way knowledge is disseminated is evolving. Sharing knowledge in online knowledge communities has become a mainstream mode of knowledge dissemination. Evaluating the level of each content producer, and thereby quantifying the production capacity of each content producer in a knowledge community, can encourage the production of more high-quality content and sustain the growth of the community. How to accurately identify the production capacity of a content producer is therefore a key issue in operating a knowledge community.

Disclosure of Invention

The disclosure provides a content processing method, a content processing apparatus, an electronic device, and a storage medium.

According to an aspect of the present disclosure, there is provided a content processing method including:

determining M knowledge domains related to N contents based on tags of the N contents produced by a target producer; wherein N and M are positive integers;

determining a domain concentration of the N contents based on the number of contents, among the N contents, corresponding to each of the M knowledge domains;

determining a recommendation level of the target producer based on the domain concentration of the N contents.

According to another aspect of the present disclosure, there is provided a content processing apparatus including:

a domain determining module, configured to determine M knowledge domains related to N contents based on tags of the N contents produced by a target producer; wherein N and M are positive integers;

a concentration determining module, configured to determine a domain concentration of the N contents based on the number of contents, among the N contents, corresponding to each of the M knowledge domains;

and a level determining module, configured to determine a recommendation level of the target producer based on the domain concentration of the N contents.

According to another aspect of the present disclosure, there is provided an electronic device including:

at least one processor; and

a memory communicatively coupled to the at least one processor; wherein

the memory stores instructions executable by the at least one processor to enable the at least one processor to perform a method according to any one of the embodiments of the present disclosure.

According to another aspect of the present disclosure, there is provided a non-transitory computer readable storage medium having stored thereon computer instructions for causing a computer to perform a method in any of the embodiments of the present disclosure.

According to another aspect of the present disclosure, a computer program product is provided, comprising a computer program which, when executed by a processor, implements the method in any of the embodiments of the present disclosure.

According to the technical solution of the present disclosure, a domain concentration is determined from the tags of the content produced by the target producer, and the recommendation level of the target producer is determined based on that domain concentration. The recommendation level can therefore accurately reflect how strongly the target producer focuses on particular domains, so that the production capacity of the target producer can be accurately identified.

It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present disclosure, nor do they limit the scope of the present disclosure. Other features of the present disclosure will become apparent from the following description.

Drawings

The drawings are included to provide a better understanding of the present solution and are not to be construed as limiting the present disclosure. Wherein:

FIG. 1 is a schematic diagram of a content processing method provided by an embodiment of the present disclosure;

FIG. 2 is a schematic diagram of a content processing method provided by another embodiment of the present disclosure;

FIG. 3 is a schematic diagram of a content processing method provided by yet another embodiment of the present disclosure;

FIG. 4 is a schematic diagram of recommended production tasks in an example application of the present disclosure;

FIG. 5 is a schematic diagram of a content processing apparatus provided by an embodiment of the present disclosure;

FIG. 6 is a schematic diagram of a content processing apparatus provided by another embodiment of the present disclosure;

FIG. 7 is a schematic diagram of a content processing apparatus provided by yet another embodiment of the present disclosure;

FIG. 8 is a block diagram of an electronic device for implementing a content processing method of an embodiment of the present disclosure.

Detailed Description

Exemplary embodiments of the present disclosure are described below with reference to the accompanying drawings, in which various details of the embodiments of the disclosure are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.

Fig. 1 shows a schematic diagram of a content processing method according to an embodiment of the present disclosure. As shown in fig. 1, the method includes:

step S110, determining M knowledge domains related to N contents based on tags of the N contents produced by a target producer; wherein N and M are positive integers;

step S120, determining a domain concentration of the N contents based on the number of contents, among the N contents, corresponding to each of the M knowledge domains;

step S130, determining a recommendation level of the target producer based on the domain concentration of the N contents.

In the embodiments of the disclosure, the target producer may be a content producer whose recommendation level is to be determined in a knowledge community such as a question-and-answer platform, an encyclopedia platform, or a document library platform. The content producer is, for example, a user who answers questions on a question-and-answer platform, or a user who uploads documents to a document library platform.

Illustratively, content produced by a target producer over a period of time may be periodically obtained. For example, the content produced by the target producer in the previous week is acquired at the start of each week, or the content produced by the target producer in the previous two weeks is acquired at the start of each week. Accordingly, the content to be currently processed, that is, the content that is newly acquired, may be determined as the N pieces of content. By periodically acquiring the content produced by the target producer and determining the recommendation level of the target producer based on the content, the problem of insensitivity to the variation of the production capacity of the content producer can be overcome, and the production capacity of the content producer can be accurately identified.
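The periodic acquisition described above can be sketched as a simple time-window filter. This is a minimal illustration only: the `produced_at` field, the function name `contents_in_window`, and the sample dates are hypothetical and do not come from the disclosure.

```python
from datetime import datetime, timedelta

def contents_in_window(contents, now, weeks=1):
    """Return the contents produced in the `weeks` weeks before `now`.

    Each content is assumed (hypothetically) to be a dict with a
    `produced_at` datetime. The window is [now - weeks, now).
    """
    start = now - timedelta(weeks=weeks)
    return [c for c in contents if start <= c["produced_at"] < now]

# Hypothetical sample data for one target producer.
contents = [
    {"id": 1, "produced_at": datetime(2021, 3, 1, 9, 0)},
    {"id": 2, "produced_at": datetime(2021, 3, 9, 9, 0)},
    {"id": 3, "produced_at": datetime(2021, 3, 14, 9, 0)},
]

week_start = datetime(2021, 3, 15)  # start of the current week
previous_week = contents_in_window(contents, week_start, weeks=1)
previous_two_weeks = contents_in_window(contents, week_start, weeks=2)
```

Here `previous_week` plays the role of the N contents when the previous week is used, and `previous_two_weeks` when the previous two weeks are used.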

In the embodiments of the present disclosure, content may refer to various carriers of information, such as knowledge documents, articles, answers to questions, and the like. In the knowledge community, tags may be determined for each content. For example, different tag sets may be defined for different domains, and the tags of a content may be selected from these tag sets according to the entities, topics, keywords, and the like associated with the content.

By establishing a correspondence between tags and knowledge domains, the M knowledge domains related to the N contents can be determined based on the tags of the N contents. Each of the N contents may have at least one tag and may relate to at least one knowledge domain. The domains related to each content are obtained and deduplicated, yielding the M knowledge domains related to the N contents.
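As a minimal sketch of this step, the tag-to-domain correspondence can be held in a lookup table and the per-content domains combined with a set union, which performs the deduplication. The table contents, tag names, and function names below are hypothetical examples, not values from the disclosure.

```python
# Hypothetical tag-to-domain correspondence; real tags and domains are
# platform-specific and are not given in the source.
TAG_TO_DOMAIN = {
    "neural network": "artificial intelligence",
    "machine learning": "artificial intelligence",
    "baking": "food",
    "smartphone": "electronics",
}

def domains_of_content(tags):
    """Return the set of knowledge domains that one content's tags map to."""
    return {TAG_TO_DOMAIN[t] for t in tags if t in TAG_TO_DOMAIN}

def knowledge_domains(contents_tags):
    """Union the domains of all N contents; the set removes duplicates."""
    domains = set()
    for tags in contents_tags:
        domains |= domains_of_content(tags)
    return domains

# N = 4 contents, each described by its list of tags.
contents_tags = [
    ["neural network", "machine learning"],
    ["machine learning"],
    ["baking"],
    ["smartphone", "unmapped tag"],  # tags without a mapping are ignored here
]
m_domains = knowledge_domains(contents_tags)
```

In this example M is 3. Ignoring unmapped tags is a design choice of the sketch; a platform could equally treat them as errors.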

Furthermore, for each of the M knowledge domains, the number of contents among the N contents that correspond to that knowledge domain can be determined, giving the distribution of the target producer's content across the knowledge domains. Based on the number of contents corresponding to each of the M knowledge domains, the domain concentration of the N contents can be determined.

For example, the proportion of the N contents that corresponds to each of the M knowledge domains may be calculated from these numbers, and the sum of the proportions of the K knowledge domains with the highest proportions, for example the top one or two knowledge domains, may be used as the domain concentration of the N contents.

For example, among 4 contents, content 1 relates to domain 1, content 2 relates to domain 1, content 3 relates to domain 2, and content 4 relates to domain 3. The domains related to contents 1 to 4 are therefore domains 1 to 3. Among the 4 contents, 2 correspond to domain 1, 1 corresponds to domain 2, and 1 corresponds to domain 3. The 4 contents are thus mainly concentrated in domain 1. Since the contents of domain 1 account for 1/2 of the 4 contents, the domain concentration of the 4 contents may be 1/2 (taking K = 1).
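The top-K calculation and the four-content example can be reproduced with a short sketch. The function name `top_k_concentration` and the representation of each content as a set of domain names are assumptions made for illustration.

```python
from collections import Counter
from fractions import Fraction

def top_k_concentration(content_domains, k=1):
    """Sum the K largest per-domain proportions among the N contents.

    `content_domains` holds one set of domain names per content.
    """
    n = len(content_domains)
    # Count how many contents correspond to each domain.
    counts = Counter(d for domains in content_domains for d in domains)
    # Proportion of the N contents for each domain, largest first.
    shares = sorted((Fraction(c, n) for c in counts.values()), reverse=True)
    return sum(shares[:k], Fraction(0))

# The example from the text: contents 1 and 2 relate to domain 1,
# content 3 to domain 2, content 4 to domain 3.
example = [{"domain 1"}, {"domain 1"}, {"domain 2"}, {"domain 3"}]
concentration_k1 = top_k_concentration(example, k=1)  # Fraction(1, 2)
concentration_k2 = top_k_concentration(example, k=2)  # Fraction(3, 4)
```

Note that if a single content relates to several domains, it is counted once for each of them, so the proportions can sum to more than 1; how such contents should be weighted is not specified in the source.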

In practical applications, other ways of determining the domain concentration may be used.

Illustratively, the value range of the domain concentration may be divided into a plurality of sub-ranges, each sub-range corresponding to one recommendation level. For example, the value range is divided into two sub-ranges by a preset threshold. If the domain concentration is higher than the preset threshold, the recommendation level of the target producer is level one; if the domain concentration is lower than the preset threshold, the recommendation level is level two. In practical applications, more incentive resources can be allocated to content producers at level one, to encourage them to produce more high-quality content.
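The mapping from sub-ranges to levels can be sketched as follows. The threshold value 0.5 and the function name are hypothetical. The source does not say which level applies when the concentration equals a threshold exactly; this sketch assumes the lower level in that case.

```python
import bisect

def recommendation_level(concentration, thresholds=(0.5,)):
    """Map a domain concentration to a level, where level 1 is the highest.

    `thresholds` are the preset boundaries between sub-ranges. A value
    must be strictly higher than a threshold to reach the better level
    (an assumption; the source leaves the equality case open).
    """
    ordered = sorted(thresholds)
    # Number of thresholds strictly below the concentration.
    exceeded = bisect.bisect_left(ordered, concentration)
    return len(ordered) + 1 - exceeded

level_high = recommendation_level(0.6)               # above the threshold
level_low = recommendation_level(0.4)                # below the threshold
level_mid = recommendation_level(0.5, (0.3, 0.7))    # three sub-ranges
```

With one threshold this gives the two-level scheme from the text; passing more thresholds gives more sub-ranges.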

Therefore, according to the method of the embodiments of the disclosure, a domain concentration is determined from the tags of the content produced by the target producer, and the recommendation level of the target producer is determined based on that domain concentration. The recommendation level can thus accurately reflect how strongly the target producer focuses on particular domains, so that the production capacity of the target producer can be accurately identified. Furthermore, the method helps encourage high-quality content producers to produce more high-quality content and keeps the ecosystem of the knowledge community healthy.

In an exemplary embodiment, as shown in fig. 2, the step S120 of determining the domain concentration of the N contents based on the number of contents corresponding to each of the M knowledge domains may include:

step S210, determining a category entropy of the N contents based on the ratio to N of the number of contents, among the N contents, corresponding to each of the M knowledge domains;

step S220, determining the domain concentration of the N contents according to the category entropy of the N contents.

For example, for the i-th knowledge domain of the M knowledge domains, where i is an integer greater than or equal to 1 and less than or equal to M, let X_i be the number of contents among the N contents that correspond to the i-th knowledge domain. The ratio of this number to N is P_i = X_i / N. The category entropy entropy_cat of the N contents may be calculated from the ratios P_1, ..., P_M according to the following formula:

[formula not reproduced in this text]

Based on the category entropy entropy_cat and the following formula, the domain concentration of the N contents can be determined:

[formula not reproduced in this text]

where tanh() represents the hyperbolic tangent function.

In the above embodiment, the domain concentration is determined according to the category entropies of the N contents, and the objectivity and accuracy of the domain concentration can be improved, thereby improving the accuracy of the recommendation level of the target producer. The method is favorable for accurately quantifying the production capacity of the target producer.

In another exemplary embodiment, as shown in fig. 3, the step S120 of determining the domain concentration of the N contents based on the number of contents corresponding to each of the M knowledge domains may include:

step S310, in a case where the M knowledge domains include a target knowledge domain, determining distribution information of the target knowledge domain based on the number of contents, among the N contents, corresponding to each of the M knowledge domains;

step S320, determining the domain concentration of the N contents based on the distribution information of the target knowledge domain.

Illustratively, the target knowledge domain may be a knowledge domain associated with the target producer. For example, the target knowledge domain may be the knowledge domain to which the target producer belongs, determined from a selection made by the target producer, or the knowledge domain in which the target producer is most proficient, determined from the content produced by the target producer over a certain period, for example the past half year or the past year.

For example, the distribution information of the target knowledge domain may be the proportion of the N contents that corresponds to the target knowledge domain, or the rank of that proportion among the proportions of the M domains, or the like. For example, if the target knowledge domain of the target producer is artificial intelligence and the content in the artificial intelligence domain accounts for 0.6 of the N contents, the distribution information may be 0.6. Alternatively, if the proportion of content in the artificial intelligence domain is the largest among the proportions of the M domains, the distribution information may be a rank of first. Correspondingly, the higher the proportion or the higher the rank, the higher the domain concentration.
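Both forms of distribution information, the proportion and the rank, can be computed in one pass, as in the sketch below. The function name, the tie handling (domains with equal counts share a rank), and the sample data are assumptions for illustration.

```python
from collections import Counter

def target_distribution(content_domains, target):
    """Return (proportion, rank) of the target domain among the N contents.

    Returns None when the target domain is not among the M domains.
    Rank 1 means no other domain has more contents.
    """
    n = len(content_domains)
    counts = Counter(d for domains in content_domains for d in domains)
    if target not in counts:
        return None
    share = counts[target] / n
    # Rank = 1 + number of domains with strictly more contents.
    rank = 1 + sum(1 for c in counts.values() if c > counts[target])
    return share, rank

# Hypothetical producer: 3 of 5 contents are in artificial intelligence.
producer_contents = [
    {"artificial intelligence"},
    {"artificial intelligence"},
    {"artificial intelligence"},
    {"food"},
    {"electronics"},
]
ai_info = target_distribution(producer_contents, "artificial intelligence")
food_info = target_distribution(producer_contents, "food")
```

Here `ai_info` corresponds to the 0.6 proportion and first rank discussed in the text.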

The domain concentration of the N contents is determined based on the distribution information of the target knowledge domain, so that the domain concentration can more accurately reflect the concentration of the target producer to the target knowledge domain, and the accuracy of the recommendation level of the target producer is improved. The method is favorable for accurately quantifying the production capacity of the target producer.

The domain concentrations determined according to the above embodiments are used to determine the recommendation levels of the target producers. Specifically, the step S130 of determining the recommendation level of the target producer based on the domain concentration of the N contents includes:

determining a content quality score for the target producer based on the N contents and the classification model;

and determining the recommendation level of the target producer according to the domain concentration of the N contents and the content quality score of the target producer.

For example, the classification model may be used to perform quality classification on the N contents, and the quality classes of the N contents are integrated to obtain the content quality score of the target producer. And determining the recommendation level of the target producer by combining the content quality score and the field concentration.

Illustratively, the classification model may be an XGBoost (Extreme Gradient Boosting) model, which may improve the accuracy of content quality scoring.

Illustratively, the recommendation level of the target producer may be obtained based on a predetermined algorithm or model using information of multiple dimensions such as domain concentration, content quality score, and the like of N pieces of content as input information of the predetermined algorithm or model.
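A minimal sketch of these two steps follows. The source does not specify how the per-content quality classes are integrated or how the dimensions are combined, so the mean and the weighted sum used here are assumptions. `toy_classifier` is a hypothetical stand-in for a trained classification model such as XGBoost and is not a real model.

```python
def content_quality_score(contents, classify):
    """Integrate per-content quality classes into one producer score.

    ASSUMPTION: the score is the mean of the predicted classes.
    `classify` is any callable returning a numeric quality class.
    """
    if not contents:
        return 0.0
    return sum(classify(c) for c in contents) / len(contents)

def toy_classifier(text):
    """Hypothetical stand-in for a trained model: class 1 if the text
    has at least five words, otherwise class 0."""
    return 1 if len(text.split()) >= 5 else 0

def combined_score(concentration, quality, w_concentration=0.5, w_quality=0.5):
    """ASSUMPTION: combine the two dimensions with a weighted sum."""
    return w_concentration * concentration + w_quality * quality

answers = ["short answer", "a longer and more detailed answer text"]
quality = content_quality_score(answers, toy_classifier)
score = combined_score(concentration=0.5, quality=quality)
```

The combined score could then be mapped to a recommendation level with preset thresholds, in the same way as the domain concentration alone.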

In practical application, the recommendation level of the target producer can be determined by combining information of other dimensions. For example, the recommendation level of the target producer may be determined according to the information of each dimension, such as the activity, the user preference, the authority, and the like of the target producer.

The activity of the target producer can be determined based on an RFM (Recency-Frequency-Monetary) model. For example, the activity of the target producer is determined based on the quality or production time of the content most recently produced by the target producer, the frequency with which the target producer produces content, and the quantity of content the target producer has produced.
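The three RFM components for a producer can be sketched as below, using the production time for recency and the content quantity in place of the monetary value, as the text describes. The function name, the window length, and the way frequency is measured (contents per week within the window) are assumptions; how the components are merged into a single activity value is not specified in the source and is left out.

```python
from datetime import datetime, timedelta

def rfm_components(production_times, now, window_days=28):
    """Return (recency_days, weekly_frequency, quantity) for one producer.

    recency_days: whole days since the most recent content.
    weekly_frequency: contents per week within the last `window_days`.
    quantity: total number of contents produced.
    """
    if not production_times:
        return None, 0.0, 0
    recency_days = (now - max(production_times)).days
    window_start = now - timedelta(days=window_days)
    in_window = [t for t in production_times if window_start <= t <= now]
    weekly_frequency = len(in_window) / (window_days / 7)
    return recency_days, weekly_frequency, len(production_times)

# Hypothetical production times for one target producer.
times = [datetime(2021, 3, 1), datetime(2021, 3, 8), datetime(2021, 3, 14)]
components = rfm_components(times, now=datetime(2021, 3, 15), window_days=14)
```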

The user preference for the target producer may be determined based on a GBDT (Gradient Boosting Decision Tree) regression model. For example, the numbers of user actions such as views, likes, and favorites on the content produced by the target producer are input into the GBDT model, and the user preference is calculated by the GBDT model.

The authority of the target producer may be derived based on the certification information of the target producer.

In the technical solution of the present disclosure, the acquisition, storage, and use of the content producer information or user information involved (including the content produced by the content producer and the information of each dimension) comply with the relevant laws and regulations and do not violate public order and good morals.

According to the above embodiment, the recommendation level of the target producer is determined using multidimensional information, which further improves the accuracy of the recommendation level, so that the production capacity of the target producer can be identified more accurately.

The recommendation level determined from multidimensional information can be used to discover high-quality producers and improve the user incentive mechanism. For example, in a question-and-answer community, high-quality producers with a high domain concentration can be discovered based on the information of each dimension of the target producer, and badges for such producers, such as "makeup expert" or "sports expert", can be displayed on the community client. On the one hand, this serves as an incentive; on the other hand, it personalizes the producers shown on question-and-answer pages, promotes interactions such as following and communication among users of the knowledge community, and increases the activity of the community.

As an example, the content processing method further includes:

determining a recommended production task for the target producer according to the recommendation level of the target producer.

For example, in a question-and-answer community, if the recommendation level of the target producer is high and the domain concentration of the target producer is high, a list of questions to be answered in the relevant domain is taken as the recommended production task of the target producer and recommended to the target producer, so as to prompt the target producer to produce answers. As shown in FIG. 4, the client of the target producer may include several different question tabs, such as a recommended questions tab, an all questions tab, and a video questions tab. The list of questions recommended to the target producer is displayed in the recommended questions tab. Each question in the list is related to a domain to which the target producer belongs, such as food or electronics.
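Selecting the to-be-answered list can be sketched as a filter over pending questions. The field names, the `required_level` parameter, and the convention that level 1 is the highest level are assumptions for illustration; the source only states that a high recommendation level and a high domain concentration trigger the recommendation.

```python
def recommend_questions(level, producer_domains, pending_questions,
                        required_level=1, limit=10):
    """Return pending questions in the producer's domains.

    ASSUMPTION: level 1 is the highest level, and only producers at
    `required_level` or better receive recommended production tasks.
    """
    if level > required_level:
        return []
    matching = [q for q in pending_questions
                if q["domain"] in producer_domains]
    return matching[:limit]

# Hypothetical pending questions in the community.
pending = [
    {"id": 1, "domain": "food"},
    {"id": 2, "domain": "electronics"},
    {"id": 3, "domain": "sports"},
]
tasks = recommend_questions(1, {"food", "electronics"}, pending)
no_tasks = recommend_questions(2, {"food", "electronics"}, pending)
```

The resulting `tasks` list corresponds to what would be shown in the recommended questions tab.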

Therefore, based on the above embodiment, the recommendation level of the target producer is used for distributing the knowledge content, so that the quality of the content produced in the knowledge community can be effectively improved. In the knowledge community, core indexes such as question answering rate and the like can be effectively improved.

In some examples, refined operation of the knowledge community may also be implemented based on information of each dimension of the target producer and the recommendation level. A variety of exemplary operations implemented according to the information of the dimensions of the target producer and the recommendation level are listed below:

(1) Monitoring core business metrics. For example, periodically monitoring the number of content producers, the number of high-quality content producers, and the like.

(2) Querying producer profiles. For example, a user may query the profile tags of a content producer using the identification information of that content producer.

(3) Screening and analyzing producers. For example, a user may screen for content producers of interest by knowledge domain, domain concentration, and content quality score.

(4) Analyzing business health. For example, the high-quality producers gained through user acquisition and activation are counted in combination with the acquisition and activation data of the knowledge community, yielding a comprehensive health analysis of the knowledge community that supports business research and decision-making.

(5) Managing tasks. For example, creating and managing batch query tasks is supported, to meet the needs of big data analysis.

The method of the embodiments of the disclosure determines a domain concentration from the tags of the content produced by the target producer, and determines the recommendation level of the target producer based on that domain concentration. The recommendation level can thus accurately reflect how strongly the target producer focuses on particular domains, so that the production capacity of the target producer can be accurately identified. Furthermore, the method helps encourage high-quality content producers to produce more high-quality content and keeps the ecosystem of the knowledge community healthy. In practical applications, the method can be implemented on a profiling framework. Specifically, the profiling framework may include:

a strategy execution module, configured to execute, in a unified manner, the models that obtain the information of each dimension;

a task scheduling module, configured to schedule and chain together the order in which the various kinds of information are acquired and the order in which the steps are executed;

an identification matching module, configured to associate the information of all dimensions of the same content producer based on the unique identifier of the content producer;

a model management module, configured to handle bringing each model online, taking it offline, modifying its configuration, and the like;

a data monitoring module, configured to monitor the period-over-period and year-over-year changes of the data on which the models depend, detect anomalies in time, and alert an operator to handle them, so as to ensure the effectiveness of each model;

and a data updating module, configured to update each kind of data as required.
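As an illustration of the identification matching module only, the per-dimension results can be joined on the producer's unique identifier. The data layout (one mapping per dimension, keyed by producer identifier), the identifiers, and the function name are hypothetical; the source does not describe how the framework stores this information.

```python
def merge_dimensions(dimensions):
    """Associate all dimension values of the same producer by unique id.

    `dimensions` maps a dimension name to a {producer_id: value} mapping.
    Returns {producer_id: {dimension_name: value}}.
    """
    profiles = {}
    for dim_name, values in dimensions.items():
        for producer_id, value in values.items():
            profiles.setdefault(producer_id, {})[dim_name] = value
    return profiles

# Hypothetical outputs of the per-dimension models.
dimensions = {
    "domain_concentration": {"producer_a": 0.6, "producer_b": 0.3},
    "content_quality": {"producer_a": 0.9},
    "activity": {"producer_b": 1.5},
}
profiles = merge_dimensions(dimensions)
```

A producer missing from one dimension simply has no entry for it in the merged profile; a real system would need a policy for such gaps.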

As an implementation of the above methods, the present disclosure also provides a content processing apparatus, as shown in fig. 5, the apparatus including:

a domain determining module 510, configured to determine M knowledge domains related to N contents based on tags of the N contents produced by a target producer; wherein N and M are positive integers;

a concentration determination module 520, configured to determine a domain concentration of the N contents based on the number of contents, among the N contents, corresponding to each of the M knowledge domains;

a level determination module 530, configured to determine a recommendation level of the target producer based on the domain concentration of the N contents.

In an exemplary embodiment, as shown in fig. 6, the concentration determination module 520 includes:

a category entropy determining unit 610, configured to determine a category entropy of the N contents based on the ratio to N of the number of contents, among the N contents, corresponding to each of the M knowledge domains;

a first concentration determining unit 620, configured to determine the domain concentration of the N contents according to the category entropy of the N contents.

In an exemplary embodiment, as shown in fig. 7, the concentration determination module 520 includes:

a distribution determining unit 720, configured to determine, in a case where the M knowledge domains include a target knowledge domain, distribution information of the target knowledge domain based on the number of contents, among the N contents, corresponding to each of the M knowledge domains;

a second concentration determination unit 730, configured to determine the domain concentration of the N contents based on the distribution information of the target knowledge domain.

Exemplarily, as shown in fig. 7, the apparatus further includes:

a task recommending module 710, configured to determine a recommended production task for the target producer according to the recommendation level of the target producer.

Illustratively, as shown in fig. 7, the level determining module 530 includes:

a quality determination unit 740 for determining a content quality score of the target producer based on the N contents and the classification model;

a level determining unit 750 for determining a recommendation level of the target producer according to the domain concentration of the N contents and the content quality score of the target producer.

The functions of each unit, module or sub-module in each apparatus in the embodiments of the present disclosure may refer to the corresponding description in the above method embodiments, and are not described herein again.

The present disclosure also provides an electronic device, a readable storage medium, and a computer program product according to embodiments of the present disclosure.

FIG. 8 illustrates a schematic block diagram of an example electronic device 800 that can be used to implement embodiments of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be examples only, and are not meant to limit implementations of the disclosure described and/or claimed herein.

As shown in fig. 8, the electronic device 800 includes a computing unit 801 that can perform various appropriate actions and processes according to a computer program stored in a Read Only Memory (ROM) 802 or a computer program loaded from a storage unit 808 into a Random Access Memory (RAM) 803. The RAM 803 can also store various programs and data required for the operation of the electronic device 800. The computing unit 801, the ROM 802, and the RAM 803 are connected to each other by a bus 804. An input/output (I/O) interface 805 is also connected to the bus 804.

A number of components in the electronic device 800 are connected to the I/O interface 805, including: an input unit 806, such as a keyboard, a mouse, or the like; an output unit 807 such as various types of displays, speakers, and the like; a storage unit 808, such as a magnetic disk, optical disk, or the like; and a communication unit 809 such as a network card, modem, wireless communication transceiver, etc. The communication unit 809 allows the electronic device 800 to exchange information/data with other devices through a computer network such as the internet and/or various telecommunication networks.

The computing unit 801 may be any of a variety of general-purpose and/or special-purpose processing components with processing and computing capabilities. Some examples of the computing unit 801 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various dedicated Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, and the like. The computing unit 801 executes the methods and processes described above, such as the content processing method. For example, in some embodiments, the content processing method may be implemented as a computer software program tangibly embodied in a machine-readable medium, such as the storage unit 808. In some embodiments, part or all of the computer program can be loaded and/or installed onto the electronic device 800 via the ROM 802 and/or the communication unit 809. When the computer program is loaded into the RAM 803 and executed by the computing unit 801, one or more steps of the content processing method described above may be performed. Alternatively, in other embodiments, the computing unit 801 may be configured to perform the content processing method by any other suitable means (e.g., by means of firmware).

Various implementations of the systems and techniques described above may be realized in digital electronic circuitry, integrated circuitry, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems on a Chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include being implemented in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, and which receives data and instructions from, and transmits data and instructions to, a storage system, at least one input device, and at least one output device.

Program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. The program code may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowchart and/or block diagram to be performed. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine, or entirely on the remote machine or server.

In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.

To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.

The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include Local Area Networks (LANs), Wide Area Networks (WANs), and the Internet.

The computer system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.

It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present disclosure may be executed in parallel or sequentially or in different orders, and are not limited herein as long as the desired results of the technical solutions disclosed in the present disclosure can be achieved.

The above detailed description should not be construed as limiting the scope of the disclosure. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present disclosure should be included in the scope of protection of the present disclosure.

Claims

1. A content processing method, comprising:

determining M knowledge domains related to N contents based on labels of the N contents produced by a target producer; wherein N and M are positive integers;

determining a domain concentration of the N contents based on the number of contents respectively corresponding to the M knowledge domains in the N contents;

determining a recommendation level for the target producer based on the domain concentration of the N contents.
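
The first step of claim 1 (labels to knowledge domains to per-domain content counts) can be illustrated with a minimal sketch. The claims do not say how a label is mapped to a knowledge domain, so the lookup table `LABEL_TO_DOMAIN`, its entries, and the function name `count_contents_per_domain` are hypothetical and serve only as an illustration.

```python
from collections import Counter

# Hypothetical label-to-domain lookup; the claims do not specify this mapping.
LABEL_TO_DOMAIN = {
    "python": "programming",
    "java": "programming",
    "diabetes": "medicine",
    "contract": "law",
}


def count_contents_per_domain(content_labels):
    """Return a Counter mapping each knowledge domain to the number of
    contents that carry at least one label belonging to that domain.

    content_labels: one list of labels per content (N lists in total).
    Labels missing from the lookup are ignored (an assumption).
    """
    counts = Counter()
    for labels in content_labels:
        # A content counts at most once per domain, even with several labels.
        domains = {LABEL_TO_DOMAIN[label] for label in labels if label in LABEL_TO_DOMAIN}
        counts.update(domains)
    return counts


contents = [["python"], ["java", "python"], ["diabetes"]]
counts = count_contents_per_domain(contents)
m = len(counts)  # number of knowledge domains M touched by the N contents
```

Here N is `len(contents)` and M is the number of keys in the returned counter; the per-domain counts are the input that the concentration step works from.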

2. The method of claim 1, wherein the determining the domain concentration of the N contents based on the number of contents respectively corresponding to the M knowledge domains in the N contents comprises:

determining a category entropy of the N contents based on the ratio of the number of contents respectively corresponding to each of the M knowledge domains to N;

and determining the domain concentration of the N contents according to the category entropy of the N contents.
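
As a rough illustration of claim 2, the sketch below computes a category entropy from the per-domain ratios and turns it into a concentration value. Two points are assumptions rather than statements of the claimed method: each content is taken to belong to exactly one knowledge domain (so the per-domain counts sum to N), and the entropy is mapped to a concentration as 1 - H / ln(M), so that 1.0 means all contents fall in one domain and 0.0 means they are spread evenly. The function names are hypothetical.

```python
import math


def category_entropy(counts):
    """Shannon entropy (natural log) of the per-domain content ratios c_i / N.

    counts: number of contents in each knowledge domain.
    Assumes every content belongs to exactly one domain, so N = sum(counts).
    """
    n = sum(counts)
    if n <= 0:
        raise ValueError("at least one content is required")
    return -sum((c / n) * math.log(c / n) for c in counts if c > 0)


def domain_concentration(counts):
    """Map entropy to a concentration in [0, 1] (an assumed mapping):
    1.0 for a single domain, 0.0 for an even spread over all domains."""
    entropy = category_entropy(counts)
    m = sum(1 for c in counts if c > 0)  # domains that actually occur
    if m <= 1:
        return 1.0
    return 1.0 - entropy / math.log(m)


focused = domain_concentration([9, 1])    # mostly one domain
scattered = domain_concentration([5, 5])  # evenly split
```

Under this mapping a producer whose contents are mostly in one domain receives a higher concentration than one whose contents are split evenly.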

3. The method of claim 1, wherein the determining the domain concentration of the N contents based on the number of contents respectively corresponding to the M knowledge domains in the N contents comprises:

determining distribution information of a target knowledge domain based on the number of contents respectively corresponding to the M knowledge domains in the N contents when the M knowledge domains include the target knowledge domain;

determining a domain concentration of the N contents based on the distribution information of the target knowledge domain.
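
Claim 3 does not define the "distribution information" of the target knowledge domain. One simple reading, assumed here purely for illustration, is the share of the N contents that fall in the target domain, with that share used directly as the concentration. The function name `target_domain_share` and the sample figures are hypothetical.

```python
def target_domain_share(domain_counts, target_domain, n_contents):
    """Fraction of the N contents that belong to the target knowledge domain.

    domain_counts: mapping from knowledge domain to number of contents.
    Returns 0.0 when the target domain does not occur; claim 3 only covers
    the case where it does, so this fallback is an assumption.
    """
    if n_contents <= 0:
        raise ValueError("n_contents must be positive")
    return domain_counts.get(target_domain, 0) / n_contents


# Illustrative figures only: 8 of 10 contents are in the target domain.
share = target_domain_share({"medicine": 8, "law": 2}, "medicine", 10)
```

With these sample counts the share is 0.8, which under the assumed reading would be taken as the domain concentration relative to the target domain.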

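4. The method of any of claims 1-3, further comprising:

determining a recommended production task for the target producer according to the recommendation level of the target producer.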

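5. The method of any of claims 1-3, wherein the determining a recommendation level for the target producer based on the domain concentration of the N contents comprises:

determining a content quality score for the target producer based on the N contents and a classification model;

and determining the recommendation level of the target producer according to the domain concentration of the N contents and the content quality score of the target producer.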
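
The claims leave open how a domain concentration and a content quality score are turned into a recommendation level. The sketch below shows one possible combination, a weighted average followed by fixed thresholds. The weight, the thresholds, the level names, and the assumption that both inputs lie in [0, 1] are all hypothetical; the classification model that would produce the quality score is not modelled here.

```python
def recommendation_level(concentration, quality_score, weight=0.5, thresholds=(0.4, 0.7)):
    """Combine domain concentration and content quality into a coarse level.

    concentration, quality_score: values assumed to lie in [0, 1].
    weight: share given to concentration in the weighted average (assumed).
    thresholds: (low, high) cut points for the combined score (assumed).
    """
    for value in (concentration, quality_score, weight):
        if not 0.0 <= value <= 1.0:
            raise ValueError("inputs must lie in [0, 1]")
    combined = weight * concentration + (1.0 - weight) * quality_score
    low, high = thresholds
    if combined >= high:
        return "high"
    if combined >= low:
        return "medium"
    return "low"


level = recommendation_level(concentration=0.9, quality_score=0.8)
```

A weighted average keeps the two signals independent and easy to tune; other combinations (for example, requiring both to exceed a minimum) would be equally compatible with the claim wording.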

6. A content processing apparatus comprising:

a domain determination module configured to determine M knowledge domains related to N contents based on labels of the N contents produced by a target producer; wherein N and M are positive integers;

a concentration determination module configured to determine a domain concentration of the N contents based on the number of contents respectively corresponding to the M knowledge domains in the N contents;

a level determination module configured to determine a recommendation level for the target producer based on the domain concentration of the N contents.

7. The apparatus of claim 6, wherein the concentration determination module comprises:

a category entropy determination unit configured to determine a category entropy of the N contents based on the ratio of the number of contents respectively corresponding to each of the M knowledge domains to N;

and a first concentration determination unit configured to determine the domain concentration of the N contents according to the category entropy of the N contents.

8. The apparatus of claim 6, wherein the concentration determination module comprises:

a distribution determination unit configured to determine distribution information of a target knowledge domain based on the number of contents corresponding to the M knowledge domains, respectively, of the N contents, when the M knowledge domains include the target knowledge domain;

a second concentration determination unit configured to determine the domain concentration of the N contents based on the distribution information of the target knowledge domain.

9. The apparatus of any of claims 6-8, further comprising:

a task recommendation module configured to determine a recommended production task for the target producer according to the recommendation level of the target producer.

10. The apparatus of any of claims 6-8, wherein the level determination module comprises:

a quality determination unit for determining a content quality score of the target producer based on the N contents and a classification model;

and a level determination unit configured to determine the recommendation level of the target producer according to the domain concentration of the N contents and the content quality score of the target producer.

11. An electronic device, comprising:

at least one processor; and

a memory storing instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-5.

12. A non-transitory computer readable storage medium having stored thereon computer instructions for causing a computer to perform the method of any one of claims 1-5.

13. A computer program product comprising a computer program which, when executed by a processor, implements the method according to any one of claims 1-5.