CN109857838B - Method and apparatus for generating information - Google Patents

Info

Publication number: CN109857838B (application CN201910111287.8A)
Authority: CN (China)
Prior art keywords: text, similarity, comment information, processed, information
Legal status: Active (the legal status is an assumption, not a legal conclusion)
Other languages: Chinese (zh)
Other versions: CN109857838A
Inventor: 赵杨
Current assignee: Douyin Vision Co Ltd (Douyin Vision Beijing Co Ltd)
Original assignee: Beijing ByteDance Network Technology Co Ltd
Application filed by Beijing ByteDance Network Technology Co Ltd; priority to CN201910111287.8A; published as CN109857838A; granted and published as CN109857838B

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Machine Translation (AREA)

Abstract

Embodiments of the present disclosure disclose a method and apparatus for generating information. One embodiment of the method includes: acquiring a text to be processed; acquiring at least one piece of comment information on the text to be processed; determining the similarity between the text to be processed and the at least one piece of comment information as a target similarity; and generating quality information for the text to be processed according to the target similarity, where the quality information characterizes the quality of the text. This embodiment evaluates the quality of a text using the similarity between the text and its comments.

Description

Method and apparatus for generating information
Technical Field
Embodiments of the present disclosure relate to the field of computer technologies, and in particular, to a method and an apparatus for generating information.
Background
In the current era of rapid internet development and popularization, search and push are the main ways users obtain information. Users can browse a wide variety of information through search and push. On the internet, text is one of the most common carriers of information.
A user may browse a vast amount of text on the internet each day. These texts include, for example, texts that the user finds by searching, texts that the user browses on some internet platforms, texts that the internet platforms or applications on the mobile side push to the user, and so on.
With the explosive growth of the amount of text on the internet, the quality of that text has also become an area of concern. How to screen out higher-quality texts from a massive corpus is a problem to be researched. At present, judging text quality through manual review is one of the common approaches.
Disclosure of Invention
Embodiments of the present disclosure propose methods and apparatuses for generating information.
In a first aspect, an embodiment of the present disclosure provides a method for generating information, the method including: acquiring a text to be processed; acquiring at least one piece of comment information of a text to be processed; determining the similarity between the text to be processed and at least one piece of comment information as target similarity; and generating quality information of the text to be processed according to the target similarity, wherein the quality information is used for representing the quality of the text to be processed.
In some embodiments, determining the similarity between the text to be processed and the at least one piece of comment information as the target similarity includes: respectively determining the similarity between the comment information in at least one piece of comment information and the text to be processed to obtain a similarity set; respectively determining the weight value of the comment information in at least one piece of comment information; and determining the weighted average of the similarity in the similarity set as the target similarity.
In some embodiments, determining the similarity between the text to be processed and the at least one piece of comment information as the target similarity includes: splitting the text to be processed into at least two sub-texts; for each of the at least two sub-texts, determining the similarity between the sub-text and each piece of comment information in the at least one piece of comment information to obtain a similarity set corresponding to the sub-text; respectively determining the weight value of each piece of comment information in the at least one piece of comment information; determining a weighted average of the similarities in the similarity set corresponding to the sub-text as a target similarity corresponding to the sub-text; and determining the target similarity corresponding to the text to be processed according to the target similarities corresponding to the sub-texts in the at least two sub-texts.
In some embodiments, determining the target similarity corresponding to the text to be processed according to the target similarity corresponding to each of the at least two sub-texts includes: and determining the average value or the maximum value of the target similarity corresponding to the sub-texts in the at least two sub-texts as the target similarity corresponding to the text to be processed.
In some embodiments, determining the weight value of each comment information in the at least one comment information includes: for comment information in at least one piece of comment information, obtaining statistical information of user operation corresponding to the comment information; and determining the weight of the comment information according to the statistical information.
In a second aspect, an embodiment of the present disclosure provides an apparatus for generating information, the apparatus including: an acquisition unit configured to acquire a text to be processed, the acquisition unit being further configured to acquire at least one piece of comment information of the text to be processed; a determining unit configured to determine the similarity between the text to be processed and the at least one piece of comment information as a target similarity; and a generating unit configured to generate quality information of the text to be processed according to the target similarity, wherein the quality information is used for representing the quality of the text to be processed.
In some embodiments, the determining unit is further configured to determine similarity between comment information in the at least one piece of comment information and the text to be processed, respectively, to obtain a similarity set; respectively determining the weight value of the comment information in at least one piece of comment information; and determining the weighted average of the similarity in the similarity set as the target similarity.
In some embodiments, the determining unit is further configured to split the text to be processed into at least two sub-texts; for each of the at least two sub-texts, determine the similarity between the sub-text and each piece of comment information in the at least one piece of comment information to obtain a similarity set corresponding to the sub-text; respectively determine the weight value of each piece of comment information in the at least one piece of comment information; determine a weighted average of the similarities in the similarity set corresponding to the sub-text as a target similarity corresponding to the sub-text; and determine the target similarity corresponding to the text to be processed according to the target similarities corresponding to the sub-texts in the at least two sub-texts.
In some embodiments, the determining unit is further configured to determine, as the target similarity corresponding to the text to be processed, an average value or a maximum value of the target similarities corresponding to the sub-texts in the at least two sub-texts.
In some embodiments, the determining unit is further configured to, for comment information in the at least one piece of comment information, obtain statistical information of a user operation corresponding to the comment information; and determining the weight of the comment information according to the statistical information.
In a third aspect, an embodiment of the present disclosure provides a method for pushing information, including: acquiring a candidate pushed text set; for a candidate pushed text in the candidate pushed text set, generating quality information of the candidate pushed text by using the method described in any implementation manner in the first aspect; and selecting a candidate pushed text with corresponding quality information meeting preset conditions from the candidate pushed text set, and pushing the selected candidate pushed text.
In a fourth aspect, an embodiment of the present disclosure provides a server, including: one or more processors; storage means for storing one or more programs; when the one or more programs are executed by the one or more processors, the one or more processors are caused to implement the method as described in any implementation of the first aspect.
In a fifth aspect, embodiments of the present disclosure provide a computer-readable medium on which a computer program is stored, which computer program, when executed by a processor, implements the method as described in any of the implementations of the first aspect.
According to the method and apparatus for generating information, a text to be processed is acquired; at least one piece of comment information on the text is acquired; the similarity between the text and the at least one piece of comment information is determined as a target similarity; and quality information representing the quality of the text is generated according to the target similarity. Evaluation of text quality is thus completed using the similarity between a text and its comments, which increases the number of features available for characterizing a text, so that the text-quality feature can further be applied in text-related analysis and processing.
Drawings
Other features, objects and advantages of the disclosure will become more apparent upon reading of the following detailed description of non-limiting embodiments thereof, made with reference to the accompanying drawings in which:
FIG. 1 is an exemplary system architecture diagram in which one embodiment of the present disclosure may be applied;
FIG. 2 is a flow diagram for one embodiment of a method for generating information, according to the present disclosure;
FIG. 3 is a schematic diagram of one application scenario of a method for generating information in accordance with an embodiment of the present disclosure;
FIG. 4 is a flow diagram of yet another embodiment of a method for generating information according to the present disclosure;
FIG. 5 is a flow diagram for one embodiment of a method for pushing information, according to the present disclosure;
FIG. 6 is a schematic block diagram illustrating one embodiment of an apparatus for generating information according to the present disclosure;
FIG. 7 is a schematic structural diagram of an electronic device suitable for use in implementing embodiments of the present disclosure.
Detailed Description
The present disclosure is described in further detail below with reference to the accompanying drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the relevant invention and not restrictive of the invention. It should be noted that, for convenience of description, only the portions related to the related invention are shown in the drawings.
It should be noted that, in the present disclosure, the embodiments and features of the embodiments may be combined with each other without conflict. The present disclosure will be described in detail below with reference to the accompanying drawings in conjunction with embodiments.
Fig. 1 illustrates an exemplary architecture 100 to which embodiments of the disclosed method for generating information or apparatus for generating information may be applied.
As shown in fig. 1, the system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 serves as a medium for providing communication links between the terminal devices 101, 102, 103 and the server 105. Network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, to name a few.
The terminal devices 101, 102, 103 interact with a server 105 via a network 104 to receive or send messages or the like. Various client applications may be installed on the terminal devices 101, 102, 103. Such as browser-type applications, reading-type applications, content-sharing-type applications, search-type applications, social platform applications, and the like.
The terminal devices 101, 102, and 103 may be hardware or software. When they are hardware, they may be various electronic devices, including but not limited to smartphones, tablet computers, e-book readers, laptop computers, desktop computers, and the like. When they are software, they can be installed in the electronic devices listed above and may be implemented as multiple pieces of software or software modules (e.g., to provide distributed services) or as a single piece of software or software module. This is not specifically limited here.
The server 105 may be a server that provides various services, such as a backend server that provides support for client applications installed on the terminal devices 101, 102, 103. The server 105 may determine similarity of the text displayed on the terminal device and at least one piece of comment information of the text, and generate quality information of the text according to the similarity. Further, quality information of the generated text may also be stored in association with the text.
It should be noted that the text and at least one piece of comment information of the text may also be directly stored in a local database of the server 105 or a database corresponding to the server 105. At this time, the server 105 may directly extract and process the text stored in the local or corresponding database and at least one piece of comment information of the text, and at this time, the terminal apparatuses 101, 102, 103 and the network 104 may not exist.
It should be noted that the method for generating information provided by the embodiment of the present disclosure is generally performed by the server 105, and accordingly, the apparatus for generating information is generally disposed in the server 105.
The server 105 may be hardware or software. When the server 105 is hardware, it may be implemented as a distributed cluster composed of multiple servers or as a single server. When the server 105 is software, it may be implemented as multiple pieces of software or software modules (e.g., to provide distributed services) or as a single piece of software or software module. This is not specifically limited here.
It should be understood that the number of terminal devices, networks, and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
With continued reference to FIG. 2, a flow 200 of one embodiment of a method for generating information in accordance with the present disclosure is shown. The method for generating information comprises the following steps:
step 201, obtaining a text to be processed.
In this embodiment, an executing body (e.g., the server 105 shown in fig. 1) of the method for generating information may first acquire a text to be processed from a local or other storage device (e.g., the terminal devices 101, 102, 103, etc. shown in fig. 1). Of course, the execution subject may also obtain the text to be processed from the corresponding database or the third-party data platform.
The text to be processed may be various texts. The text to be processed may be a text designated by a technician or a text to be processed by a user sending an instruction. The text to be processed may also be a text determined according to a preset condition. In different application scenarios, the text to be processed may be different.
Step 202, at least one piece of comment information of the text to be processed is obtained.
In this embodiment, after the text to be processed is determined, comment information of the text to be processed may be further acquired. The comment information may refer to information for analyzing and evaluating the text to be processed. The review information may be used to set forth a perspective or attitude.
Similarly, the execution subject may obtain at least one piece of comment information of the text to be processed from a local storage device, another storage device, a corresponding database, or a third-party data platform. Generally, the text to be processed and the comment information of the text to be processed can be stored in association.
Because the text to be processed may have many comment information, in this case, part or all of the comment information of the text to be processed may be acquired according to the actual application requirements.
Step 203, determining the similarity between the text to be processed and at least one piece of comment information as the target similarity.
In this embodiment, the similarity between the text to be processed and the at least one piece of comment information may be determined using various existing text similarity determination methods, for example, algorithms based on keyword matching, on vector spaces, or on deep learning. Keyword-matching-based algorithms include, for example, N-Gram language models. Vector-space-based algorithms include, for example, TF-IDF (Term Frequency-Inverse Document Frequency) and Word2vec (Word to Vector). Deep-learning-based algorithms include, for example, DSSM (Deep Structured Semantic Model).
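The vector-space route can be sketched in a few lines. The following pure-Python TF-IDF/cosine snippet is illustrative only and not the disclosure's reference implementation; the token lists and the smoothed-IDF formula are assumptions for the example.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Build sparse TF-IDF vectors (dicts) for a list of tokenized documents."""
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(doc))                      # document frequency per term
    idf = {t: math.log(n / df[t]) + 1.0 for t in df}  # smoothed IDF (assumption)
    vecs = []
    for doc in docs:
        tf = Counter(doc)
        vecs.append({t: (tf[t] / len(doc)) * idf[t] for t in tf})
    return vecs

def cosine(u, v):
    """Cosine similarity between two sparse vectors represented as dicts."""
    dot = sum(u[t] * v.get(t, 0.0) for t in u)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

text      = "comment similarity measures text quality".split()
comment   = "this comment praises the text quality".split()
unrelated = "weather forecast for tomorrow".split()
vecs = tfidf_vectors([text, comment, unrelated])
sim_related   = cosine(vecs[0], vecs[1])  # comment shares terms with the text
sim_unrelated = cosine(vecs[0], vecs[2])  # no shared terms
```

A comment that shares vocabulary with the text scores higher than an unrelated one, which is the signal the method builds on.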
Alternatively, at least one piece of comment information may be merged into one text, and then the similarity between the text to be processed and the text corresponding to the comment information is determined as the similarity between the text to be processed and at least one piece of comment information by using the above-described various text similarity determination methods.
Optionally, the similarity between the comment information in the at least one piece of comment information and the text to be processed may be respectively determined, so as to obtain a similarity set; respectively determining the weight value of the comment information in at least one piece of comment information; and determining the weighted average of the similarity in the similarity set as the target similarity.
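The weighted-average step above is a one-liner; this sketch assumes per-comment similarities and weights have already been computed (the example values are invented).

```python
def target_similarity(similarities, weights):
    """Weighted average of per-comment similarities, used as the target similarity."""
    total = sum(weights)
    if total == 0:
        return 0.0
    return sum(s * w for s, w in zip(similarities, weights)) / total

sims = [0.8, 0.4, 0.6]   # similarity of each comment to the text to be processed
wts  = [3.0, 1.0, 2.0]   # weight value of each comment
score = target_similarity(sims, wts)  # (0.8*3 + 0.4*1 + 0.6*2) / 6
```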
The weight value of each piece of comment information may be specified in advance by a technician, or may be determined according to the related information of each piece of comment information. For example, different weight values may be set according to the comment time corresponding to each piece of comment information.
Optionally, for comment information in at least one piece of comment information, statistical information of user operation corresponding to the comment information may be acquired, and then, according to the statistical information, a weight of the comment information may be determined.
The user operation may refer to various interactive operations between the user and the comment information. It should be understood that different terminal devices used by the user, different platforms displaying comment information, and the like may have different forms of user operations.
For example, the user operation may be a reply to the comment information, an operation of sharing the comment information to another page, an operation of clicking a control indicating an attitude toward the comment information (such as support or oppose), and the like.
The statistical information of the user operation may refer to some statistical data obtained after the user operation is processed by using a statistical method. For example, the statistical information of the user operations may be the total number of various user operations received by the comment information, the total number of comment operations received by the comment information, and the like.
In some application scenarios, a higher weight value may be set for comment information that has received a greater total number of user operations.
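One way to realize "more user operations, higher weight" is a log-scaled count; the log mapping and the operation names (`likes`, `replies`, `shares`) are assumptions of this sketch, since the disclosure only requires that the weight grow with the statistics.

```python
import math

def comment_weight(stats):
    """Map user-operation counts for one comment to a weight value.

    Log scaling is an assumed choice: it keeps a single viral comment
    from completely dominating the weighted average.
    """
    total = stats.get("likes", 0) + stats.get("replies", 0) + stats.get("shares", 0)
    return 1.0 + math.log1p(total)

w_popular = comment_weight({"likes": 120, "replies": 8, "shares": 4})
w_quiet   = comment_weight({"likes": 1})
```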
Using a weighted average makes the quality evaluation of the text to be processed a relative value, which is convenient for quality ranking and other processing together with other texts to be processed.
In practice, various text similarity determination methods can be flexibly selected and used according to different application scenes or service requirements.
And step 204, generating quality information of the text to be processed according to the target similarity.
In this embodiment, the quality information may be used to characterize the quality of the text to be processed. Under different application scenarios, the representation mode of the quality information can be flexible and changeable. For example, the quality information may be a specific numerical value for representing the quality score. For another example, the quality information may be preset grade marks indicating different degrees of goodness.
Generally, a text of good quality tends to trigger more analysis or discussion about the text itself. Therefore, the comment information on a good-quality text generally has a high similarity to the text. If the comment information is unrelated to the text, it can be considered that the text did not generate much response, and its quality may be poor.
Based on this, the quality of the text to be processed can be evaluated based on the similarity of the text to be processed and the comment information thereof. Generally, it can be considered that the higher the similarity of the text to be processed and its comment information, the higher the quality of the text.
Based on this directly proportional relationship between the target similarity and the quality of the text to be processed, different methods for generating the quality information can be set according to how the quality information is represented. For example, if a specific numerical value is used to represent the quality of the text to be processed, the target similarity may be used directly as the quality score; a higher quality score then represents a higher-quality text.
For another example, if grade identifiers are used to indicate the quality of the text to be processed, the correspondence between similarity intervals and the different grade identifiers may be set in advance. As an example, three grade identifiers "A", "B", and "C" represent the quality of the text to be processed, each corresponding to a different similarity interval. A corresponding grade identifier can then be generated as the quality information of the text to be processed according to the similarity interval in which the determined target similarity falls.
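The grade mapping can be sketched directly; the interval boundaries below (0.7 and 0.4) are illustrative assumptions, not values fixed by the disclosure.

```python
def quality_grade(target_similarity):
    """Map a target similarity in [0, 1] to a preset grade identifier.

    Boundaries are assumed for illustration; in practice a technician
    would set the intervals for "A", "B", and "C" in advance.
    """
    if target_similarity >= 0.7:
        return "A"
    if target_similarity >= 0.4:
        return "B"
    return "C"
```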
With continued reference to fig. 3, fig. 3 is a schematic diagram 300 of an application scenario of the method for generating information according to the present embodiment. In the application scenario of fig. 3, the executing entity may obtain the text to be processed 301 from the corresponding database, and then obtain three pieces of comment information (as shown by reference numerals 302, 303, and 304 in the figure) corresponding to the text to be processed 301.
Then, the text to be processed 301 and the three pieces of comment information may be represented as corresponding feature vectors, respectively, based on a VSM (Vector Space Model). Then, the similarity of the feature vector of the text to be processed 301 and the feature vectors of the three pieces of comment information may be calculated, respectively.
Then, as indicated by reference numeral 305 in the figure, an average value of the similarity degrees corresponding to the obtained three pieces of comment information, respectively, may be determined as the quality score of the text to be processed 301.
The method provided by the above embodiment of the present disclosure evaluates the quality of the text according to the similarity between the text and the comments thereof, so that a specific representation of the quality of the text can be realized, the number of features that can be used for characterizing the text is increased, and further, the feature of the quality of the text can also be applied to text-related analysis and processing.
With further reference to fig. 4, a flow 400 of yet another embodiment of a method for generating information is shown. The flow 400 of the method for generating information comprises the steps of:
step 401, obtaining a text to be processed.
Step 402, at least one piece of comment information of the text to be processed is obtained.
The specific implementation process of steps 401 and 402 can refer to the related description of steps 201 and 202 in the corresponding embodiment of fig. 2, and is not repeated herein.
Step 403, splitting the text to be processed into at least two sub-texts.
In this embodiment, different splitting modes may be selected according to different application scenarios and service requirements. For example, the text to be processed may be split according to paragraphs, and each paragraph of the text to be processed is taken as one sub-text. For another example, the text to be processed may be split according to sentences, and each sentence of the text to be processed is taken as one sub-text.
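Both splitting modes mentioned above are simple to realize; this sketch assumes paragraphs are separated by blank lines and sentences end in `.`, `!`, or `?`, which are conventions of the example rather than requirements of the disclosure.

```python
import re

def split_text(text, mode="paragraph"):
    """Split a text to be processed into sub-texts by paragraph or by sentence."""
    if mode == "paragraph":
        parts = [p.strip() for p in text.split("\n\n")]   # blank-line separated
    else:
        # split after sentence-ending punctuation followed by whitespace
        parts = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text)]
    return [p for p in parts if p]

doc = "First paragraph. Still first.\n\nSecond paragraph here."
paras = split_text(doc, "paragraph")   # two paragraph sub-texts
sents = split_text(doc, "sentence")    # three sentence sub-texts
```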
Step 404, for each sub-text in the at least two sub-texts, the following steps 4041 to 4043 are executed:
Step 4041, determining the similarity between the sub-text and each piece of comment information in the at least one piece of comment information to obtain a similarity set corresponding to the sub-text.
Step 4042, respectively determining the weight values of the pieces of comment information in the at least one piece of comment information.
Step 4043, determining the weighted average of the similarities in the similarity set corresponding to the sub-text as the target similarity corresponding to the sub-text.
The specific implementation of steps 4041, 4042, and 4043 may refer to the related description of step 203 in the embodiment corresponding to fig. 2 and will not be repeated here.
Step 405, determining the target similarity corresponding to the text to be processed according to the target similarity corresponding to each of the sub-texts in the at least two sub-texts.
In this embodiment, the target similarity corresponding to the text to be processed may be determined by comprehensively considering the target similarity corresponding to each sub-text.
Optionally, an average value of the target similarities corresponding to the sub-texts in the at least two sub-texts may be determined as the target similarity corresponding to the text to be processed.
Optionally, the maximum value of the target similarities corresponding to the sub-texts in the at least two sub-texts may be determined as the target similarity corresponding to the text to be processed.
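The two aggregation options (average or maximum over sub-text similarities) can be sketched as one helper; the example values are invented.

```python
def aggregate(sub_similarities, how="mean"):
    """Combine per-sub-text target similarities into one value for the whole text.

    "mean" rewards uniformly relevant texts; "max" rewards a text whose
    strongest sub-text matches the comments well.
    """
    if not sub_similarities:
        return 0.0
    if how == "max":
        return max(sub_similarities)
    return sum(sub_similarities) / len(sub_similarities)

subs = [0.2, 0.9, 0.4]          # target similarity of each sub-text
mean_score = aggregate(subs)            # 0.5
max_score  = aggregate(subs, "max")     # 0.9
```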
As can be seen from fig. 4, compared with the embodiment corresponding to fig. 2, the flow 400 of the method for generating information in this embodiment highlights the steps of splitting the text to be processed into at least two sub-texts and then determining the quality information of the text to be processed by comprehensively considering the similarity between each sub-text and each piece of comment information. Therefore, when a text to be processed is long or its content lacks internal consistency, it can be processed in split form, improving the accuracy of the determined quality information.
With continued reference to fig. 5, a flow 500 of one embodiment of a method for pushing information in accordance with the present disclosure is shown. The method for pushing the information comprises the following steps:
step 501, obtaining a candidate pushed text set.
In this embodiment, an executing entity (e.g., the server 105 shown in fig. 1) of the method for pushing information may first obtain a candidate pushed text set from a corresponding database or other data platform. The candidate pushed texts may be various texts that can be pushed.
Step 502, for a candidate pushed text in the candidate pushed text set, generating quality information of the candidate pushed text.
In the present embodiment, the quality information of each candidate pushed text may be generated by using the method for generating information described in the embodiments corresponding to fig. 2 and fig. 4.
Step 503, selecting a candidate pushed text with corresponding quality information meeting a preset condition from the candidate pushed text set, and pushing the selected candidate pushed text.
In this step, the preset condition may be preset by a technician according to application requirements. For example, when the quality information is represented by a specific numerical value, the preset condition may be that the quality information is greater than a preset threshold.
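Selecting candidates whose quality information meets the preset condition reduces, in the numerical case, to a threshold filter; the threshold value and the dict-based candidate pool are assumptions of this sketch.

```python
def select_for_push(candidates, threshold=0.5):
    """Keep candidate pushed texts whose quality score meets the preset condition.

    `candidates` maps a text id to its quality score; "greater than a
    preset threshold" is the assumed preset condition from the example above.
    """
    return [tid for tid, score in candidates.items() if score > threshold]

pool = {"t1": 0.82, "t2": 0.31, "t3": 0.55}
pushed = select_for_push(pool)   # only the texts clearing the threshold
```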
The method provided by this embodiment of the disclosure constrains text quality according to the preset condition, thereby screening the candidate pushed texts in the candidate pushed text set and filtering out those that do not meet the condition. This effectively reduces the number of pushed texts and reduces the traffic consumed by terminal devices receiving them. At the same time, this approach increases the exposure of higher-quality texts and reduces the exposure of lower-quality texts.
With further reference to fig. 6, as an implementation of the methods shown in the above figures, the present disclosure provides an embodiment of an apparatus for generating information, which corresponds to the method embodiment shown in fig. 2, and which is particularly applicable in various electronic devices.
As shown in fig. 6, the apparatus 600 for generating information provided by the present embodiment includes an acquisition unit 601, a determination unit 602, and a generation unit 603. Wherein the obtaining unit 601 is configured to obtain a text to be processed; the obtaining unit 601 is further configured to obtain at least one piece of comment information of the text to be processed; the determining unit 602 is configured to determine a similarity between the text to be processed and at least one piece of comment information as a target similarity; the generating unit 603 is configured to generate quality information of the text to be processed according to the target similarity, wherein the quality information is used for representing the quality of the text to be processed.
In the present embodiment, in the apparatus 600 for generating information: the specific processing of the obtaining unit 601, the determining unit 602, and the generating unit 603 and the technical effects thereof can refer to the related descriptions of step 201, step 202, and step 203 in the corresponding embodiment of fig. 2, which are not repeated herein.
In some optional implementations of this embodiment, the determining unit is further configured to: determine, respectively, the similarity between each piece of the at least one piece of comment information and the text to be processed to obtain a similarity set; determine, respectively, the weight value of each piece of the at least one piece of comment information; and determine the weighted average of the similarities in the similarity set as the target similarity.
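The weighted-average computation just described can be sketched as follows. This is a hedged illustration: the disclosure does not prescribe a concrete similarity measure, so cosine similarity over word-count vectors is used here purely as an assumed stand-in, and the function names are the author's own.

```python
from collections import Counter
import math

def similarity(text_a, text_b):
    # Cosine similarity over word-count vectors -- one possible measure;
    # the disclosure leaves the concrete similarity computation open.
    va, vb = Counter(text_a.split()), Counter(text_b.split())
    dot = sum(va[w] * vb[w] for w in va.keys() & vb.keys())
    norm = math.sqrt(sum(c * c for c in va.values())) * \
           math.sqrt(sum(c * c for c in vb.values()))
    return dot / norm if norm else 0.0

def target_similarity(text, comments, weights):
    # Weighted average of the per-comment similarities, as in the
    # optional implementation above.
    total = sum(weights)
    return sum(similarity(text, c) * w
               for c, w in zip(comments, weights)) / total
```

For example, a text identical to one comment and unrelated to another, with equal weights, yields a target similarity of about 0.5.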
In some optional implementations of this embodiment, the determining unit is further configured to: split the text to be processed into at least two sub-texts; for each of the at least two sub-texts, determine, respectively, the similarity between the sub-text and each piece of the at least one piece of comment information to obtain a similarity set corresponding to the sub-text; determine, respectively, the weight value of each piece of the at least one piece of comment information; determine the weighted average of the similarities in the similarity set corresponding to the sub-text as the target similarity corresponding to the sub-text; and determine the target similarity corresponding to the text to be processed according to the target similarities corresponding to the sub-texts in the at least two sub-texts.
In some optional implementations of this embodiment, the determining unit is further configured to: determine the average value or the maximum value of the target similarities corresponding to the at least two sub-texts as the target similarity corresponding to the text to be processed.
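The sub-text variant, including the mean-or-max aggregation just mentioned, can be sketched as follows. The blank-line paragraph split and the Jaccard word-overlap similarity are illustrative assumptions; the disclosure specifies neither.

```python
def word_overlap(a, b):
    # Jaccard overlap of word sets -- a stand-in similarity measure
    # (the disclosure does not prescribe a specific one).
    wa, wb = set(a.split()), set(b.split())
    return len(wa & wb) / len(wa | wb) if (wa | wb) else 0.0

def subtext_target_similarity(text, comments, weights, aggregate="mean"):
    # Split the text into sub-texts (here: blank-line-separated paragraphs),
    # compute a weighted-average similarity per sub-text, then aggregate
    # over sub-texts by mean or max, as the optional implementations allow.
    subs = [p for p in text.split("\n\n") if p.strip()]
    total_w = sum(weights)
    per_sub = [
        sum(word_overlap(s, c) * w for c, w in zip(comments, weights)) / total_w
        for s in subs
    ]
    return max(per_sub) if aggregate == "max" else sum(per_sub) / len(per_sub)
```

Using the maximum rewards a text in which at least one sub-text is heavily discussed; using the mean rewards uniformly commented texts.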
In some optional implementations of this embodiment, the determining unit is further configured to: for each piece of the at least one piece of comment information, obtain statistical information of user operations corresponding to the comment information, and determine the weight of the comment information according to the statistical information.
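Deriving weights from user-operation statistics might look like the sketch below, where like counts stand in for the statistics. The additive smoothing and the normalization are illustrative assumptions; the disclosure only states that the weight is determined from the statistical information.

```python
def comment_weights(like_counts):
    # Weight each comment by (1 + likes), normalized so weights sum to 1.
    # The concrete formula is an assumption, not part of the disclosure.
    raw = [1 + likes for likes in like_counts]
    total = sum(raw)
    return [r / total for r in raw]
```

A heavily liked comment thus contributes more to the target similarity than an unnoticed one, while the +1 smoothing keeps zero-like comments from being ignored entirely.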
In the apparatus provided by this embodiment of the disclosure, the acquisition unit acquires the text to be processed and at least one piece of comment information of the text to be processed; the determining unit determines the similarity between the text to be processed and the at least one piece of comment information as the target similarity; and the generating unit generates the quality information of the text to be processed according to the target similarity, the quality information characterizing the quality of the text to be processed. The evaluation of text quality is thus completed using the similarity between a text and its comments, which increases the number of features available for characterizing the text, so that this text-quality feature can be further applied in text-related analysis and processing.
Referring now to FIG. 7, a block diagram of an electronic device (e.g., the server of FIG. 1) 700 suitable for use in implementing embodiments of the present disclosure is shown. The server shown in fig. 7 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present disclosure.
As shown in fig. 7, electronic device 700 may include a processing means (e.g., central processing unit, graphics processor, etc.) 701 that may perform various appropriate actions and processes in accordance with a program stored in a Read Only Memory (ROM) 702 or a program loaded from a storage device 708 into a Random Access Memory (RAM) 703. In the RAM 703, various programs and data necessary for the operation of the electronic device 700 are also stored. The processing device 701, the ROM 702, and the RAM 703 are connected to each other by a bus 704. An input/output (I/O) interface 705 is also connected to the bus 704.
Generally, the following devices may be connected to the I/O interface 705: input devices 706 including, for example, a touch screen, touch pad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, etc.; an output device 707 including, for example, a Liquid Crystal Display (LCD), a speaker, a vibrator, and the like; storage 708 including, for example, magnetic tape, hard disk, etc.; and a communication device 709. The communication means 709 may allow the electronic device 700 to communicate wirelessly or by wire with other devices to exchange data. While fig. 7 illustrates an electronic device 700 having various means, it is to be understood that not all illustrated means are required to be implemented or provided. More or fewer devices may alternatively be implemented or provided. Each block shown in fig. 7 may represent one device or may represent multiple devices as desired.
In particular, according to an embodiment of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such embodiments, the computer program may be downloaded and installed from a network via the communication means 709, or may be installed from the storage means 708, or may be installed from the ROM 702. The computer program, when executed by the processing device 701, performs the above-described functions defined in the methods of embodiments of the present disclosure.
It should be noted that the computer readable medium described in the embodiments of the present disclosure may be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In embodiments of the disclosure, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In embodiments of the present disclosure, however, a computer readable signal medium may comprise a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. 
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, optical cables, RF (radio frequency), etc., or any suitable combination of the foregoing.
The computer readable medium may be embodied in the server; or may exist separately and not be assembled into the server. The computer readable medium carries one or more programs which, when executed by the server, cause the server to: acquiring a text to be processed; acquiring at least one piece of comment information of a text to be processed; determining the similarity between the text to be processed and at least one piece of comment information as target similarity; and generating quality information of the text to be processed according to the target similarity, wherein the quality information is used for representing the quality of the text to be processed.
Computer program code for carrying out operations for embodiments of the present disclosure may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java, Smalltalk, and C++, and conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units described in the embodiments of the present disclosure may be implemented by software or hardware. The described units may also be provided in a processor, and may be described as: a processor comprising: the device comprises an acquisition unit, a determination unit and a generation unit. The names of these units do not in some cases constitute a limitation on the unit itself, and for example, the acquiring unit may also be described as a "unit that acquires text to be processed".
The foregoing description presents only preferred embodiments of the disclosure and illustrates the principles of the technology employed. Those skilled in the art will appreciate that the scope of the invention in the embodiments of the present disclosure is not limited to technical solutions formed by the specific combination of the above technical features, and also covers other technical solutions formed by any combination of the above technical features or their equivalents without departing from the inventive concept, for example, a technical solution formed by replacing the above features with (but not limited to) technical features with similar functions disclosed in the embodiments of the present disclosure.

Claims (11)

1. A method for generating information, comprising:
acquiring a text to be processed;
acquiring at least one piece of comment information of the text to be processed;
determining the similarity between the text to be processed and the at least one piece of comment information as a target similarity;
generating quality information of the text to be processed according to the target similarity, wherein the quality information is used for representing the quality of the text to be processed;
wherein the determining the similarity between the text to be processed and the at least one piece of comment information as a target similarity includes: splitting the text to be processed into at least two sub-texts; for each of the at least two sub-texts, determining, respectively, the similarity between the sub-text and each piece of the at least one piece of comment information to obtain a similarity set corresponding to the sub-text; respectively determining the weight values of the comment information in the at least one piece of comment information; determining a weighted average of the similarities in the similarity set corresponding to the sub-text as a target similarity corresponding to the sub-text; and determining the target similarity corresponding to the text to be processed according to the target similarity corresponding to each of the sub-texts in the at least two sub-texts.
2. The method of claim 1, wherein the determining the similarity of the text to be processed and the at least one piece of comment information as a target similarity comprises:
respectively determining the similarity between the comment information in the at least one piece of comment information and the text to be processed to obtain a similarity set;
respectively determining the weight values of the comment information in the at least one piece of comment information;
and determining the weighted average of the similarity in the similarity set as the target similarity.
3. The method according to claim 1, wherein the determining the target similarity corresponding to the text to be processed according to the target similarity corresponding to each of the sub-texts in the at least two sub-texts comprises:
and determining the average value or the maximum value of the target similarity corresponding to the sub-texts in the at least two sub-texts as the target similarity corresponding to the text to be processed.
4. The method of claim 1 or 2, wherein the determining the weight value of the comment information of the at least one comment information respectively comprises:
for the comment information in the at least one piece of comment information, obtaining statistical information of user operation corresponding to the comment information; and determining the weight of the comment information according to the statistical information.
5. An apparatus for generating information, comprising:
an acquisition unit configured to acquire a text to be processed;
the obtaining unit is further configured to obtain at least one piece of comment information of the text to be processed;
a determining unit configured to determine a similarity of the text to be processed and the at least one piece of comment information as a target similarity;
the generating unit is configured to generate quality information of the text to be processed according to the target similarity, wherein the quality information is used for representing the quality of the text to be processed;
the determination unit is further configured to: splitting the text to be processed into at least two sub-texts; for the subfiles in the at least two subfiles, determining the similarity between the subfiles and the comment information in the at least one piece of comment information respectively to obtain a similarity set corresponding to the subfiles; respectively determining the weight values of the comment information in the at least one piece of comment information; determining a weighted average of the similarity in the similarity set corresponding to the sub-document as a target similarity corresponding to the sub-document; and determining the target similarity corresponding to the text to be processed according to the target similarity corresponding to each of the sub-texts in the at least two sub-texts.
6. The apparatus of claim 5, wherein the determination unit is further configured to:
respectively determining the similarity between the comment information in the at least one piece of comment information and the text to be processed to obtain a similarity set;
respectively determining the weight values of the comment information in the at least one piece of comment information;
and determining the weighted average of the similarity in the similarity set as the target similarity.
7. The apparatus of claim 5, wherein the determination unit is further configured to:
and determining the average value or the maximum value of the target similarity corresponding to the sub-texts in the at least two sub-texts as the target similarity corresponding to the text to be processed.
8. The apparatus of claim 5 or 6, wherein the determining unit is further configured to:
for the comment information in the at least one piece of comment information, obtaining statistical information of user operation corresponding to the comment information; and determining the weight of the comment information according to the statistical information.
9. A method for pushing information, comprising:
acquiring a candidate pushed text set;
for a candidate pushed text in the set of candidate pushed texts, generating quality information of the candidate pushed text by using the method according to one of claims 1 to 4;
and selecting a candidate pushed text with corresponding quality information meeting preset conditions from the candidate pushed text set, and pushing the selected candidate pushed text.
10. A server, comprising:
one or more processors;
a storage device having one or more programs stored thereon;
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1-4.
11. A computer-readable medium, on which a computer program is stored which, when being executed by a processor, carries out the method according to any one of claims 1-4.
CN201910111287.8A 2019-02-12 2019-02-12 Method and apparatus for generating information Active CN109857838B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910111287.8A CN109857838B (en) 2019-02-12 2019-02-12 Method and apparatus for generating information

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910111287.8A CN109857838B (en) 2019-02-12 2019-02-12 Method and apparatus for generating information

Publications (2)

Publication Number Publication Date
CN109857838A CN109857838A (en) 2019-06-07
CN109857838B true CN109857838B (en) 2021-01-26

Family

ID=66897608

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910111287.8A Active CN109857838B (en) 2019-02-12 2019-02-12 Method and apparatus for generating information

Country Status (1)

Country Link
CN (1) CN109857838B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110287910A (en) * 2019-06-28 2019-09-27 北京百度网讯科技有限公司 For obtaining the method and device of information

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102254038A (en) * 2011-08-11 2011-11-23 武汉安问科技发展有限责任公司 System and method for analyzing network comment relevance
CN105844424A (en) * 2016-05-30 2016-08-10 中国计量学院 Product quality problem discovery and risk assessment method based on network comments
CN107885768A (en) * 2017-09-27 2018-04-06 昆明理工大学 A kind of user comment method for digging for APP software use qualities

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9652868B2 (en) * 2014-06-26 2017-05-16 Amazon Technologies, Inc. Automatic color palette based recommendations
US20160048768A1 (en) * 2014-08-15 2016-02-18 Here Global B.V. Topic Model For Comments Analysis And Use Thereof
WO2017065742A1 (en) * 2015-10-12 2017-04-20 Hewlett-Packard Development Company, L.P.. Concept map assessment
CN106600482A (en) * 2016-12-30 2017-04-26 西北工业大学 Multi-source social data fusion multi-angle travel information perception and intelligent recommendation method
CN107239512B (en) * 2017-05-18 2019-10-08 华中科技大学 A kind of microblogging comment spam recognition methods of combination comment relational network figure


Also Published As

Publication number Publication date
CN109857838A (en) 2019-06-07

Similar Documents

Publication Publication Date Title
CN109460513B (en) Method and apparatus for generating click rate prediction model
CN108121699B (en) Method and apparatus for outputting information
CN109858045B (en) Machine translation method and device
CN106919711B (en) Method and device for labeling information based on artificial intelligence
CN109359194B (en) Method and apparatus for predicting information categories
CN107798622B (en) Method and device for identifying user intention
CN110619078B (en) Method and device for pushing information
CN113688310B (en) Content recommendation method, device, equipment and storage medium
CN108121814B (en) Search result ranking model generation method and device
CN110737824B (en) Content query method and device
CN109190123B (en) Method and apparatus for outputting information
WO2024099171A1 (en) Video generation method and apparatus
CN110059172B (en) Method and device for recommending answers based on natural language understanding
CN109992719B (en) Method and apparatus for determining push priority information
CN108491387B (en) Method and apparatus for outputting information
CN111026849B (en) Data processing method and device
CN108509442B (en) Search method and apparatus, server, and computer-readable storage medium
CN110852057A (en) Method and device for calculating text similarity
CN112182255A (en) Method and apparatus for storing media files and for retrieving media files
CN109857838B (en) Method and apparatus for generating information
CN112231444A (en) Processing method and device for corpus data combining RPA and AI and electronic equipment
CN110881056A (en) Method and device for pushing information
CN112148751B (en) Method and device for querying data
CN112148865B (en) Information pushing method and device
CN111767290B (en) Method and apparatus for updating user portraits

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CP01 Change in the name or title of a patent holder

Address after: 100041 B-0035, 2 floor, 3 building, 30 Shixing street, Shijingshan District, Beijing.

Patentee after: Tiktok vision (Beijing) Co.,Ltd.

Address before: 100041 B-0035, 2 floor, 3 building, 30 Shixing street, Shijingshan District, Beijing.

Patentee before: BEIJING BYTEDANCE NETWORK TECHNOLOGY Co.,Ltd.

Address after: 100041 B-0035, 2 floor, 3 building, 30 Shixing street, Shijingshan District, Beijing.

Patentee after: Douyin Vision Co.,Ltd.

Address before: 100041 B-0035, 2 floor, 3 building, 30 Shixing street, Shijingshan District, Beijing.

Patentee before: Tiktok vision (Beijing) Co.,Ltd.