Detailed Description of Embodiments
The disclosure is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the related invention and do not limit it. It should also be noted that, for ease of description, the drawings show only the parts relevant to the related invention.
It should be noted that, where no conflict arises, the embodiments of the disclosure and the features of those embodiments may be combined with one another. The disclosure is described in detail below with reference to the drawings and in conjunction with the embodiments.
Fig. 1 shows an exemplary architecture 100 to which embodiments of the method for generating information or the apparatus for generating information of the disclosure may be applied.
As shown in Fig. 1, the system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 is the medium that provides communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired communication links, wireless communication links, or fiber-optic cables.
The terminal devices 101, 102, 103 interact with the server 105 through the network 104 to receive or send messages and the like. Various client applications may be installed on the terminal devices 101, 102, 103, such as browser applications, reading applications, content-sharing applications, search applications, and social platform applications.
The terminal devices 101, 102, 103 may be hardware or software. When they are hardware, they may be various electronic devices, including but not limited to smartphones, tablet computers, e-book readers, laptop computers, and desktop computers. When they are software, they may be installed in the electronic devices listed above, and may be implemented as multiple pieces of software or software modules (for example, multiple pieces of software or software modules for providing distributed services) or as a single piece of software or software module. No specific limitation is made here.
The server 105 may be a server that provides various services, for example a back-end server that supports the client applications installed on the terminal devices 101, 102, 103. The server 105 may determine the similarity between a text displayed on a terminal device and at least one piece of comment information of that text, and generate quality information of the text according to the similarity. Further, the generated quality information may be stored in association with the text.
It should be noted that the text and its at least one piece of comment information may also be stored directly on the server 105 locally or in a database corresponding to the server 105. In that case, the server 105 may directly extract and process the text and the comment information stored locally or in the corresponding database, and the terminal devices 101, 102, 103 and the network 104 may be absent.
It should be noted that the method for generating information provided by embodiments of the disclosure is generally executed by the server 105; correspondingly, the apparatus for generating information is generally arranged in the server 105.
It should be noted that the server 105 may be hardware or software. When the server 105 is hardware, it may be implemented as a distributed server cluster composed of multiple servers or as a single server. When the server 105 is software, it may be implemented as multiple pieces of software or software modules (for example, multiple pieces of software or software modules for providing distributed services) or as a single piece of software or software module. No specific limitation is made here.
It should be understood that the numbers of terminal devices, networks, and servers in Fig. 1 are merely illustrative. There may be any number of terminal devices, networks, and servers, as the implementation requires.
With continued reference to Fig. 2, a flow 200 of one embodiment of the method for generating information according to the disclosure is shown. The method for generating information includes the following steps:
Step 201: obtain a text to be processed.
In this embodiment, the executing body of the method for generating information (for example, the server 105 shown in Fig. 1) may first obtain the text to be processed from local storage or from another storage device (for example, the terminal devices 101, 102, 103 shown in Fig. 1). The executing body may of course also obtain the text to be processed from its corresponding database or from a third-party data platform.
The text to be processed may be any of various texts. It may be a text specified by a technician, a text that a user has sent an instruction to process, or a text determined according to a preset condition. The text to be processed may differ between application scenarios.
Step 202: obtain at least one piece of comment information of the text to be processed.
In this embodiment, once the text to be processed has been determined, its comment information may then be obtained. Comment information may refer to information that analyzes or evaluates the text to be processed, and may be used to express a viewpoint or an attitude.
Similarly, the executing body may obtain the at least one piece of comment information from local storage, another storage device, a corresponding database, or a third-party data platform. In general, the text to be processed and its comment information may be stored in association.
Because the text to be processed may have a large number of comments, some or all of its comment information may be obtained according to the needs of the actual application.
Step 203: determine the similarity between the text to be processed and the at least one piece of comment information as a target similarity.
In this embodiment, any of the various existing methods for determining text similarity may be used to determine the similarity between the text to be processed and the at least one piece of comment information, for example algorithms based on keyword matching, algorithms based on vector spaces, and algorithms based on deep learning. Algorithms based on keyword matching include, for example, N-Gram (an n-gram language model). Algorithms based on vector spaces include, for example, TF-IDF (Term Frequency-Inverse Document Frequency) and Word2vec (Word to Vector). Algorithms based on deep learning include, for example, DSSM (Deep Structured Semantic Model).
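As an illustration of the vector-space family mentioned above, the following is a minimal sketch rather than the method of the disclosure. It assumes whitespace tokenization and raw term-frequency vectors (the disclosure fixes neither), and the function name `cosine_similarity` is hypothetical.

```python
import math
from collections import Counter


def cosine_similarity(text_a: str, text_b: str) -> float:
    """Cosine similarity of two texts over raw term-frequency vectors.

    Assumption: tokens are lowercase whitespace-separated words. A real
    system would likely use a proper tokenizer and TF-IDF or embeddings.
    """
    vec_a = Counter(text_a.lower().split())
    vec_b = Counter(text_b.lower().split())
    # Only terms present in both texts contribute to the dot product.
    dot = sum(vec_a[t] * vec_b[t] for t in vec_a.keys() & vec_b.keys())
    norm_a = math.sqrt(sum(c * c for c in vec_a.values()))
    norm_b = math.sqrt(sum(c * c for c in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text is treated as similar to nothing
    return dot / (norm_a * norm_b)
```

The result lies in [0, 1] because term counts are non-negative, which makes it convenient to reuse directly as a score.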
Optionally, the at least one piece of comment information may be merged into a single text. Any of the above methods for determining text similarity may then be used to determine the similarity between the text to be processed and the merged comment text, and that similarity is taken as the similarity between the text to be processed and the at least one piece of comment information.
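The merge-then-compare option can be sketched as follows. This is an illustrative example only: it assumes character bigrams and Jaccard overlap as a simple stand-in for a keyword-matching measure, and the names `char_ngrams` and `merged_comment_similarity` are hypothetical.

```python
def char_ngrams(text: str, n: int = 2) -> set:
    """Return the set of character n-grams of a lowercased text."""
    text = text.lower()
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def merged_comment_similarity(text: str, comments: list, n: int = 2) -> float:
    """Merge all comments into one text, then compare it with the text.

    Assumption: Jaccard overlap of character n-grams is used here purely
    for illustration; any text-similarity method could be substituted.
    """
    merged = " ".join(comments)
    grams_text = char_ngrams(text, n)
    grams_comments = char_ngrams(merged, n)
    if not grams_text or not grams_comments:
        return 0.0
    return len(grams_text & grams_comments) / len(grams_text | grams_comments)
```

Merging trades per-comment detail for a single comparison, so it cannot weight individual comments differently.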
Optionally, the similarity between each piece of comment information and the text to be processed may be determined separately to obtain a similarity set; a weight value may be determined for each piece of comment information; and the weighted average of the similarities in the similarity set may be determined as the target similarity.
The weight value of each piece of comment information may be specified in advance by a technician, or may be determined according to information related to that comment. For example, different weight values may be set according to the time at which each comment was posted.
Optionally, for each piece of comment information, statistical information on the user operations corresponding to that comment may be obtained, and the weight of the comment may then be determined according to the statistical information.
User operations may refer to the various interactions between users and the comment information. It should be understood that the forms of user operation may vary with the terminal device the user uses, the platform that displays the comment information, and so on.
For example, a user operation may be commenting on the comment, sharing the comment to another page, or clicking a control that indicates an attitude toward the comment (such as support or opposition).
The statistical information of user operations may refer to statistical data obtained by processing the user operations with statistical methods. For example, it may be the total number of user operations of all kinds that a comment has received, or the total number of comment operations that a comment has received.
In some application scenarios, a higher weight value may be set for comment information that has received a larger total number of specified user operations.
Using a weighted average makes the quality evaluation of the text to be processed a relative value, which also makes it easier to rank the text by quality against other texts to be processed.
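The weighted-average option can be sketched as below. This is a hedged example: the add-one rule that turns interaction counts into weights is an assumption made for illustration (the disclosure only says that more operations may mean a higher weight), and both function names are hypothetical.

```python
def weights_from_operation_counts(counts: list) -> list:
    """Map per-comment user-operation totals to weights.

    Assumption: weight = count + 1, so that a comment with no
    interactions still contributes a little.
    """
    return [c + 1 for c in counts]


def weighted_target_similarity(similarities: list, weights: list) -> float:
    """Weighted average of per-comment similarities (the target similarity)."""
    if len(similarities) != len(weights):
        raise ValueError("need exactly one weight per similarity")
    total = sum(weights)
    if total == 0:
        return 0.0  # no usable weights: fall back to zero
    return sum(s * w for s, w in zip(similarities, weights)) / total
```

For example, similarities `[0.2, 0.8]` with weights `[1, 3]` give (0.2·1 + 0.8·3) / 4 = 0.65, so the more heavily interacted comment dominates.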
In practice, the method for determining text similarity may be selected and applied flexibly according to the application scenario or business requirements.
Step 204: generate quality information of the text to be processed according to the target similarity.
In this embodiment, the quality information may be used to characterize the quality of the text to be processed. The representation of the quality information may vary between application scenarios. For example, the quality information may be a specific numerical value that represents a quality score. As another example, the quality information may be a preset grade identifier that indicates a degree of quality.
In general, a high-quality text tends to prompt more analysis or discussion of the text itself, so the comments on a high-quality text can be expected to have a relatively high similarity to the text. If the comments have little to do with the text, the text can be regarded as having provoked little response, and its quality may be poor.
On this basis, the quality of the text to be processed can be assessed from the similarity between the text and its comment information. In general, the higher the similarity between the text to be processed and its comment information, the higher the quality of the text is considered to be.
Given this positive correlation between the target similarity and the quality of the text to be processed, different methods of generating the quality information may be set according to how the quality information is represented. For example, if the quality of the text to be processed is represented by a specific numerical value, the target similarity may be used directly as the quality score of the text. In that case, a higher quality score indicates a higher-quality text.
As another example, the quality of the text to be processed may be represented by a grade identifier. A correspondence between similarity and the quality represented by each grade identifier may be set in advance. For instance, three grade identifiers "A", "B", and "C" may be used to represent the quality of the text to be processed, each corresponding to a different similarity interval. The grade identifier whose similarity interval contains the determined target similarity is then generated as the quality information of the text to be processed.
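The grade-identifier representation can be sketched as a simple interval lookup. The interval boundaries 0.7 and 0.4 below are assumptions chosen only for illustration (the disclosure does not specify them), and `quality_grade` is a hypothetical name.

```python
def quality_grade(target_similarity: float,
                  thresholds=((0.7, "A"), (0.4, "B"))) -> str:
    """Map a target similarity to a grade identifier.

    Assumption: "A" covers [0.7, 1], "B" covers [0.4, 0.7), and "C"
    covers everything below 0.4. Thresholds must be in descending order.
    """
    for lower_bound, grade in thresholds:
        if target_similarity >= lower_bound:
            return grade
    return "C"
```

Keeping the boundaries in a parameter lets each application scenario preset its own correspondence between similarity intervals and grades.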
With continued reference to Fig. 3, Fig. 3 is a schematic diagram 300 of an application scenario of the method for generating information according to this embodiment. In the application scenario of Fig. 3, the executing body may obtain a text to be processed 301 from the corresponding database, and then obtain the three pieces of comment information corresponding to the text to be processed 301 (shown as reference numerals 302, 303, and 304 in the figure).
Next, the text to be processed 301 and the three pieces of comment information may each be represented as a feature vector based on a VSM (Vector Space Model). The similarity between the feature vector of the text to be processed 301 and the feature vector of each of the three pieces of comment information may then be calculated.
Finally, as shown by reference numeral 305 in the figure, the average of the three resulting similarities is determined as the quality score of the text to be processed 301.
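The scenario of Fig. 3 can be sketched end to end as follows. This is an assumption-laden illustration: the feature vectors are plain word-count vectors over whitespace tokens, the similarity is cosine similarity, and the names `to_vector`, `cosine`, and `quality_score` are hypothetical.

```python
import math
from collections import Counter
from statistics import mean


def to_vector(text: str) -> Counter:
    """Represent a text as a word-count feature vector (a simple VSM)."""
    return Counter(text.lower().split())


def cosine(u: Counter, v: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[t] * v[t] for t in u.keys() & v.keys())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def quality_score(text: str, comments: list) -> float:
    """Average the text-to-comment similarities into one quality score."""
    if not comments:
        return 0.0  # assumption: a text with no comments scores zero
    text_vec = to_vector(text)
    return mean([cosine(text_vec, to_vector(c)) for c in comments])
```

With three comments, `quality_score` computes three similarities and returns their unweighted mean, matching the averaging step at reference numeral 305.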
The method provided by the above embodiment of the disclosure assesses the quality of a text according to the similarity between the text and its comments. This yields a concrete representation of text quality and increases the number of features available for characterizing a text. The text-quality feature can also be applied further in text-related analysis and processing.
With further reference to Fig. 4, a flow 400 of another embodiment of the method for generating information is shown. The flow 400 of the method for generating information includes the following steps:
Step 401: obtain a text to be processed.
Step 402: obtain at least one piece of comment information of the text to be processed.
For the specific execution of steps 401 and 402, refer to the descriptions of steps 201 and 202 in the embodiment corresponding to Fig. 2; details are not repeated here.
Step 403: split the text to be processed into at least two sub-texts.
In this embodiment, the splitting method may be chosen according to the application scenario and business requirements. For example, the text to be processed may be split by paragraph, with each paragraph taken as one sub-text. As another example, the text to be processed may be split by sentence, with each sentence taken as one sub-text.
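The two splitting modes can be sketched with regular expressions. This is a rough illustration under stated assumptions: paragraphs are separated by blank lines, and sentences end at one of the punctuation marks `. ! ?` or their Chinese full-width counterparts; both function names are hypothetical. Splitting on a pattern that may match an empty string requires Python 3.7 or later.

```python
import re


def split_by_paragraph(text: str) -> list:
    """Split on blank lines; each non-empty paragraph is one sub-text."""
    return [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]


def split_by_sentence(text: str) -> list:
    """Split after sentence-ending punctuation; each sentence is one sub-text.

    Assumption: any of . ! ? and their full-width forms ends a sentence.
    This naive rule also splits decimals such as "3.14".
    """
    parts = re.split(r"(?<=[.!?。！？])\s*", text)
    return [s.strip() for s in parts if s.strip()]
```

A production system would more likely rely on a language-aware sentence segmenter; the regular expression only shows where the split points fall.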
Step 404: for each sub-text of the at least two sub-texts, execute the following steps 4041 to 4043:
Step 4041: determine the similarity between the sub-text and each piece of comment information in the at least one piece of comment information, to obtain a similarity set corresponding to the sub-text.
Step 4042: determine the weight value of each piece of comment information in the at least one piece of comment information.
Step 4043: determine the weighted average of the similarities in the similarity set corresponding to the sub-text as the target similarity corresponding to the sub-text.
For the specific execution of steps 4041, 4042, and 4043, refer to the description of step 203 in the embodiment corresponding to Fig. 2; details are not repeated here.
Step 405: determine the target similarity corresponding to the text to be processed according to the target similarities corresponding to the at least two sub-texts.
In this embodiment, the target similarities of all the sub-texts may be considered together to determine the target similarity corresponding to the text to be processed.
Optionally, the average of the target similarities corresponding to the at least two sub-texts may be determined as the target similarity corresponding to the text to be processed.
Optionally, the maximum of the target similarities corresponding to the at least two sub-texts may be determined as the target similarity corresponding to the text to be processed.
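Steps 404 and 405 can be sketched together as follows. The example is a sketch under stated assumptions: the similarity measure is injected as a function so that any method from step 203 can be used, the weights are supplied by the caller, and the name `flow_400_target_similarity` and its `mode` parameter are hypothetical.

```python
from typing import Callable, Sequence


def flow_400_target_similarity(
    subtexts: Sequence[str],
    comments: Sequence[str],
    weights: Sequence[float],
    similarity: Callable[[str, str], float],
    mode: str = "mean",
) -> float:
    """Per-sub-text weighted averages, aggregated by mean or max."""
    total_weight = sum(weights)
    if not subtexts or len(comments) != len(weights) or total_weight == 0:
        raise ValueError("need sub-texts and one positive-sum weight per comment")
    per_subtext = []
    for sub in subtexts:
        # Step 4041: similarity of this sub-text to every comment.
        sims = [similarity(sub, c) for c in comments]
        # Step 4043: weighted average is the sub-text's target similarity.
        per_subtext.append(
            sum(s * w for s, w in zip(sims, weights)) / total_weight
        )
    # Step 405: combine the sub-text values into one target similarity.
    if mode == "max":
        return max(per_subtext)
    if mode == "mean":
        return sum(per_subtext) / len(per_subtext)
    raise ValueError("mode must be 'mean' or 'max'")
```

Taking the maximum rewards a text in which at least one passage is discussed closely, while the mean rewards texts whose passages are all reflected in the comments.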
As can be seen from Fig. 4, compared with the embodiment corresponding to Fig. 2, the flow 400 of the method for generating information in this embodiment highlights the steps of first splitting the text to be processed into at least two sub-texts and then considering the similarity between each sub-text and each piece of comment information to determine the quality information of the text to be processed. Thus, when a text to be processed is long or its content lacks continuity, it can be handled by splitting, which helps improve the accuracy of the quality information determined for the text.
With continued reference to Fig. 5, a flow 500 of one embodiment of the method for pushing information according to the disclosure is shown. The method for pushing information includes the following steps:
Step 501: obtain a set of candidate push texts.
In this embodiment, the executing body of the method for pushing information (for example, the server 105 shown in Fig. 1) may first obtain the set of candidate push texts from a corresponding database or another data platform. A candidate push text may be any text that can be pushed.
Step 502: for each candidate push text in the set of candidate push texts, generate quality information of the candidate push text.
In this embodiment, the quality information of each candidate push text may be generated using the method for generating information described in the embodiment corresponding to Fig. 2 or Fig. 4.
Step 503: select, from the set of candidate push texts, the candidate push texts whose quality information meets a preset condition, and push the selected candidate push texts.
In this step, the preset condition may be set in advance by a technician according to application requirements. For example, when the quality information is represented by a specific numerical value, the preset condition may be that the quality information is greater than a preset threshold.
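Step 503 with a numerical threshold can be sketched as a filter. This is a minimal illustration: the scoring function is passed in (it could be any of the quality-information methods above), and the name `select_for_push` and the example threshold are hypothetical.

```python
from typing import Callable, Iterable, List


def select_for_push(
    candidates: Iterable[str],
    quality_of: Callable[[str], float],
    threshold: float,
) -> List[str]:
    """Keep only candidates whose quality score exceeds the threshold.

    Assumption: quality is a number and the preset condition is
    "quality > threshold"; other conditions would need another predicate.
    """
    return [text for text in candidates if quality_of(text) > threshold]


# Usage with precomputed, made-up scores standing in for step 502.
scores = {"text-1": 0.82, "text-2": 0.35, "text-3": 0.60}
selected = select_for_push(scores, scores.get, threshold=0.5)
```

Here `selected` keeps `text-1` and `text-3` in their original order and drops `text-2`, whose score does not exceed the threshold.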
The method provided by the above embodiment of the disclosure constrains text quality through the preset condition and screens the candidate push texts accordingly, removing those that do not meet the preset condition. This effectively reduces the number of texts pushed and can therefore reduce the traffic a terminal device consumes in receiving pushed texts. It also increases the exposure of higher-quality texts and reduces the exposure of lower-quality texts.
With further reference to Fig. 6, as an implementation of the methods shown in the above figures, the disclosure provides an embodiment of an apparatus for generating information. This apparatus embodiment corresponds to the method embodiment shown in Fig. 2, and the apparatus can be applied in various electronic devices.
As shown in Fig. 6, the apparatus 600 for generating information provided in this embodiment includes an acquiring unit 601, a determining unit 602, and a generating unit 603. The acquiring unit 601 is configured to obtain a text to be processed, and is further configured to obtain at least one piece of comment information of the text to be processed. The determining unit 602 is configured to determine the similarity between the text to be processed and the at least one piece of comment information as a target similarity. The generating unit 603 is configured to generate quality information of the text to be processed according to the target similarity, where the quality information is used to characterize the quality of the text to be processed.
In this embodiment, for the specific processing of the acquiring unit 601, the determining unit 602, and the generating unit 603 of the apparatus 600 for generating information, and the technical effects they bring, refer respectively to the descriptions of steps 201 and 202, step 203, and step 204 in the embodiment corresponding to Fig. 2; details are not repeated here.
In some optional implementations of this embodiment, the determining unit is further configured to: determine the similarity between each piece of comment information in the at least one piece of comment information and the text to be processed, to obtain a similarity set; determine the weight value of each piece of comment information in the at least one piece of comment information; and determine the weighted average of the similarities in the similarity set as the target similarity.
In some optional implementations of this embodiment, the determining unit is further configured to: split the text to be processed into at least two sub-texts; for each sub-text of the at least two sub-texts, determine the similarity between the sub-text and each piece of comment information in the at least one piece of comment information, to obtain a similarity set corresponding to the sub-text; determine the weight value of each piece of comment information in the at least one piece of comment information; determine the weighted average of the similarities in the similarity set corresponding to the sub-text as the target similarity corresponding to the sub-text; and determine the target similarity corresponding to the text to be processed according to the target similarities corresponding to the at least two sub-texts.
In some optional implementations of this embodiment, the determining unit is further configured to: determine the average, or the maximum, of the target similarities corresponding to the at least two sub-texts as the target similarity corresponding to the text to be processed.
In some optional implementations of this embodiment, the determining unit is further configured to: for each piece of comment information in the at least one piece of comment information, obtain statistical information on the user operations corresponding to the comment information; and determine the weight of the comment information according to the statistical information.
In the apparatus provided by the above embodiment of the disclosure, the acquiring unit obtains a text to be processed and at least one piece of comment information of the text; the determining unit determines the similarity between the text to be processed and the at least one piece of comment information as a target similarity; and the generating unit generates quality information of the text to be processed according to the target similarity, where the quality information is used to characterize the quality of the text to be processed. The quality of a text is thereby assessed using the similarity between the text and its comments, and the number of features available for characterizing a text is increased. The text-quality feature can also be applied further in text-related analysis and processing.
Referring now to Fig. 7, a schematic structural diagram is shown of an electronic device 700 (for example, the server in Fig. 1) suitable for implementing embodiments of the disclosure. The server shown in Fig. 7 is merely an example and should not impose any limitation on the functions or scope of use of embodiments of the disclosure.
As shown in Fig. 7, the electronic device 700 may include a processing device 701 (such as a central processing unit or a graphics processor), which can perform various appropriate actions and processing according to a program stored in a read-only memory (ROM) 702 or a program loaded from a storage device 708 into a random access memory (RAM) 703. The RAM 703 also stores the various programs and data required for the operation of the electronic device 700. The processing device 701, the ROM 702, and the RAM 703 are connected to one another through a bus 704. An input/output (I/O) interface 705 is also connected to the bus 704.
In general, the following devices may be connected to the I/O interface 705: an input device 706 including, for example, a touch screen, touch pad, keyboard, mouse, camera, microphone, accelerometer, or gyroscope; an output device 707 including, for example, a liquid crystal display (LCD), speaker, or vibrator; a storage device 708 including, for example, a magnetic tape or hard disk; and a communication device 709. The communication device 709 may allow the electronic device 700 to communicate with other devices, wirelessly or by wire, to exchange data. Although Fig. 7 shows an electronic device 700 with various devices, it should be understood that it is not required to implement or include all of the devices shown. More or fewer devices may alternatively be implemented or included. Each block shown in Fig. 7 may represent one device or, as needed, multiple devices.
In particular, according to embodiments of the disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the disclosure include a computer program product that includes a computer program carried on a computer-readable medium, the computer program containing program code for executing the method shown in the flowchart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication device 709, installed from the storage device 708, or installed from the ROM 702. When the computer program is executed by the processing device 701, the above functions defined in the methods of embodiments of the disclosure are performed.
It should be noted that the computer-readable medium described in embodiments of the disclosure may be a computer-readable signal medium, a computer-readable storage medium, or any combination of the two. A computer-readable storage medium may be, for example but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of computer-readable storage media may include, but are not limited to: an electrical connection with one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In embodiments of the disclosure, a computer-readable storage medium may be any tangible medium that contains or stores a program, where the program may be used by or in combination with an instruction execution system, apparatus, or device. In embodiments of the disclosure, a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, in which computer-readable program code is carried. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium that can send, propagate, or transmit a program for use by or in combination with an instruction execution system, apparatus, or device. The program code contained on a computer-readable medium may be transmitted over any suitable medium, including but not limited to a wire, an optical cable, RF (radio frequency), or any suitable combination of the above.
The computer-readable medium may be included in the above server, or it may exist separately without being assembled into the server. The computer-readable medium carries one or more programs which, when executed by the server, cause the server to: obtain a text to be processed; obtain at least one piece of comment information of the text to be processed; determine the similarity between the text to be processed and the at least one piece of comment information as a target similarity; and generate quality information of the text to be processed according to the target similarity, where the quality information is used to characterize the quality of the text to be processed.
Computer program code for performing the operations of embodiments of the disclosure may be written in one or more programming languages or a combination thereof, including object-oriented programming languages such as Java, Smalltalk, and C++, and conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. Where a remote computer is involved, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
The flowcharts and block diagrams in the drawings illustrate the architecture, functions, and operations of possible implementations of systems, methods, and computer program products according to various embodiments of the disclosure. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or a portion of code that contains one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the drawings. For example, two blocks shown in succession may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functions involved. It should also be noted that each block in the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, may be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
The units described in embodiments of the disclosure may be implemented in software or in hardware. The described units may also be provided in a processor; for example, a processor may be described as including an acquiring unit, a determining unit, and a generating unit. The names of these units do not, in some cases, limit the units themselves; for example, the acquiring unit may also be described as "a unit that obtains a text to be processed".
The above description covers only the preferred embodiments of the disclosure and an explanation of the technical principles applied. Those skilled in the art should understand that the scope of the invention involved in embodiments of the disclosure is not limited to technical solutions formed by the specific combinations of the above technical features, and should also cover other technical solutions formed by any combination of the above technical features or their equivalents without departing from the above inventive concept, for example, technical solutions formed by replacing the above features with technical features having similar functions disclosed in (but not limited to) embodiments of the disclosure.