CN110362808A - Text analyzing method and device - Google Patents

Info

Publication number
CN110362808A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810252454.6A
Other languages
Chinese (zh)
Other versions
CN110362808B (en
Inventor
茅越
李明
蔡龙军
沈一
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba China Co Ltd
Original Assignee
Youku Network Technology Beijing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Youku Network Technology Beijing Co Ltd filed Critical Youku Network Technology Beijing Co Ltd
Priority to CN201810252454.6A priority Critical patent/CN110362808B/en
Publication of CN110362808A publication Critical patent/CN110362808A/en
Application granted granted Critical
Publication of CN110362808B publication Critical patent/CN110362808B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present disclosure relates to a text analysis method and device. The method includes: obtaining feature information corresponding to multiple word segments of a text to be analyzed; and inputting the feature information into an analysis model for processing to obtain a text analysis result for the text to be analyzed, where the analysis model includes a convolution module, a relationship module, and a splicing output module. By performing text analysis with an analysis model that combines these three modules, embodiments of the present disclosure can improve the accuracy of the text analysis result.

Description

Text analyzing method and device
Technical field
The present disclosure relates to the field of computers, and in particular to a text analysis method and device.
Background
With the continued spread of social networks and the mobile Internet, the cost of publishing information keeps falling, and more and more users are willing to share their own views online and to comment on people, events, and products. These comments reflect people's opinions and sentiment toward things, and they are of great significance for public opinion analysis and for prediction based on big data. It is therefore necessary to analyze users' comment texts in order to determine information such as users' opinions and sentiment orientation. However, the accuracy of the analysis results obtained by text analysis in the related art is low.
Summary
In view of this, the present disclosure proposes a text analysis method that can obtain text analysis results accurately.
According to one aspect of the present disclosure, a text analysis method is provided, including: obtaining feature information corresponding to multiple word segments of a text to be analyzed; and inputting the feature information into an analysis model for processing to obtain a text analysis result for the text to be analyzed, where the analysis model includes a convolution module, a relationship module, and a splicing output module.
In one possible implementation, inputting the feature information into the analysis model for processing to obtain the text analysis result for the text to be analyzed includes:
inputting the feature information into the convolution module for processing to obtain a convolution result;
inputting the convolution result into the relationship module for processing to obtain a relation result;
inputting the relation result into the splicing output module for processing to obtain the text analysis result for the text to be analyzed.
In one possible implementation, obtaining the feature information corresponding to the multiple word segments of the text to be analyzed includes:
performing vectorization on each of the multiple word segments of the text to be analyzed to obtain multiple pieces of vector information corresponding to the multiple word segments;
determining the feature information of the multiple word segments according to the multiple pieces of vector information.
In one possible implementation, the splicing output module includes multiple fully connected layers and a softmax layer,
where inputting the relation result into the splicing output module for processing to obtain the text analysis result for the text to be analyzed includes:
performing vector splicing on the relation result to obtain spliced vector information;
inputting the spliced vector information sequentially into the multiple fully connected layers and the softmax layer for processing to obtain the text analysis result for the text to be analyzed.
In one possible implementation, the method further includes:
obtaining training feature information corresponding to multiple word segments of a sample text;
inputting the training feature information into an initial analysis model for processing to obtain a training analysis result for the sample text, where the initial analysis model includes an initial convolution module, an initial relationship module, and an initial splicing output module;
determining a model loss of the initial analysis model according to the training analysis result and an annotation result of the sample text;
adjusting the parameter weights in the initial analysis model according to the model loss to determine an adjusted analysis model;
determining the adjusted analysis model as the final analysis model when the model loss meets a training condition.
In one possible implementation, the convolution module includes a convolutional neural network, and the relationship module includes a relational network.
According to another aspect of the present disclosure, a text analysis device is provided, including:
a feature acquiring unit, configured to obtain feature information corresponding to multiple word segments of a text to be analyzed;
a result acquiring unit, configured to input the feature information into an analysis model for processing to obtain a text analysis result for the text to be analyzed,
where the analysis model includes a convolution module, a relationship module, and a splicing output module.
In one possible implementation, the result acquiring unit includes:
a first result acquiring subunit, configured to input the feature information into the convolution module for processing to obtain a convolution result;
a second result acquiring subunit, configured to input the convolution result into the relationship module for processing to obtain a relation result;
a third result acquiring subunit, configured to input the relation result into the splicing output module for processing to obtain the text analysis result for the text to be analyzed.
In one possible implementation, the feature acquiring unit includes:
a vectorization subunit, configured to perform vectorization on each of the multiple word segments of the text to be analyzed to obtain multiple pieces of vector information corresponding to the multiple word segments;
a feature determining subunit, configured to determine the feature information of the multiple word segments according to the multiple pieces of vector information.
In one possible implementation, the splicing output module includes multiple fully connected layers and a softmax layer, where the third result acquiring subunit includes:
a splicing subunit, configured to perform vector splicing on the relation result to obtain spliced vector information;
an information processing subunit, configured to input the spliced vector information sequentially into the multiple fully connected layers and the softmax layer for processing to obtain the text analysis result for the text to be analyzed.
In one possible implementation, the device further includes:
a training feature acquiring unit, configured to obtain training feature information corresponding to multiple word segments of a sample text;
a training result acquiring unit, configured to input the training feature information into an initial analysis model for processing to obtain a training analysis result for the sample text, where the initial analysis model includes an initial convolution module, an initial relationship module, and an initial splicing output module;
a loss determining unit, configured to determine a model loss of the initial analysis model according to the training analysis result and an annotation result of the sample text;
a model adjusting unit, configured to adjust the parameter weights in the initial analysis model according to the model loss to determine an adjusted analysis model;
a model determining unit, configured to determine the adjusted analysis model as the final analysis model when the model loss meets a training condition.
In one possible implementation, the convolution module includes a convolutional neural network, and the relationship module includes a relational network.
According to another aspect of the present disclosure, a viewpoint extraction device is provided, including: a processor; and a memory for storing processor-executable instructions; where the processor is configured to perform the above method.
According to another aspect of the present disclosure, a non-volatile computer-readable storage medium is provided, on which computer program instructions are stored, where the computer program instructions, when executed by a processor, implement the above viewpoint extraction method.
According to embodiments of the present disclosure, feature information corresponding to multiple word segments of a text to be analyzed can be obtained and input into an analysis model for processing to obtain a text analysis result. Performing text analysis with an analysis model that includes a convolution module, a relationship module, and a splicing output module improves the accuracy of the text analysis result.
Other features and aspects of the present disclosure will become clear from the following detailed description of exemplary embodiments with reference to the accompanying drawings.
Brief description of the drawings
The accompanying drawings, which are incorporated in and constitute a part of the specification, together with the specification illustrate exemplary embodiments, features, and aspects of the present disclosure and serve to explain its principles.
Fig. 1 is a flowchart of a text analysis method according to an exemplary embodiment.
Fig. 2 is a flowchart of step S11 of a text analysis method according to an exemplary embodiment.
Fig. 3 is a schematic diagram of the analysis model of a text analysis method according to an exemplary embodiment.
Fig. 4 is a flowchart of step S12 of a text analysis method according to an exemplary embodiment.
Fig. 5 is a flowchart of a text analysis method according to an exemplary embodiment.
Fig. 6 is a block diagram of a text analysis device according to an exemplary embodiment.
Fig. 7 is a block diagram of a text analysis device according to an exemplary embodiment.
Fig. 8 is a block diagram of a text analysis device according to an exemplary embodiment.
Detailed description
Various exemplary embodiments, features, and aspects of the present disclosure are described in detail below with reference to the accompanying drawings. Identical reference numerals in the drawings denote elements with identical or similar functions. Although various aspects of the embodiments are shown in the drawings, the drawings are not necessarily drawn to scale unless specifically noted.
The word "exemplary" is used here to mean "serving as an example, embodiment, or illustration". Any embodiment described here as "exemplary" should not necessarily be construed as preferred over or superior to other embodiments.
In addition, numerous specific details are given in the detailed description below to better illustrate the present disclosure. Those skilled in the art will understand that the present disclosure can be practiced without certain of these details. In some instances, methods, means, elements, and circuits well known to those skilled in the art are not described in detail, so as to highlight the gist of the present disclosure.
Fig. 1 is a flowchart of a text analysis method according to an exemplary embodiment. The method can be applied in a server. As shown in Fig. 1, the text analysis method according to an embodiment of the present disclosure includes:
in step S11, obtaining feature information corresponding to multiple word segments of a text to be analyzed;
in step S12, inputting the feature information into an analysis model for processing to obtain a text analysis result for the text to be analyzed,
where the analysis model includes a convolution module, a relationship module, and a splicing output module.
According to embodiments of the present disclosure, feature information corresponding to multiple word segments of a text to be analyzed can be obtained and input into an analysis model for processing to obtain a text analysis result. Performing text analysis with an analysis model that includes a convolution module, a relationship module, and a splicing output module improves the accuracy of the text analysis result. Embodiments of the present disclosure can help business personnel understand the angle from which users comment on an object and whether their attitude is positive or negative, so that the value of the comment information (the text to be analyzed) is fully exploited.
For example, the text to be analyzed may include a comment text written by a user about a certain object. The object can be any object that can be commented on and analyzed, such as a video, audio, news item, person, event, or product.
In one possible implementation, before the user's comment text is segmented into words, the comment text can be preprocessed to improve the accuracy and efficiency of the analysis. Preprocessing the comment text may include deleting specified characters from the comment text (for example, the repost markers in microblog comments) and converting traditional Chinese characters in the comment text into simplified Chinese characters. After preprocessing, the text to be analyzed is determined.
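The preprocessing described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the repost-marker pattern and the tiny traditional-to-simplified table are hypothetical, and a practical system would rely on a complete conversion table.

```python
import re

# Hypothetical pattern: strips "//@user:" style repost markers from microblog comments.
REPOST_MARKER = re.compile(r"//@[^:：]+[:：]")

# Hypothetical, deliberately tiny traditional-to-simplified table for illustration only.
TRAD_TO_SIMP = str.maketrans(
    {"電": "电", "視": "视", "劇": "剧", "歡": "欢", "這": "这"}
)


def preprocess(comment: str) -> str:
    """Delete repost markers, convert traditional characters, and trim whitespace."""
    text = REPOST_MARKER.sub("", comment)
    text = text.translate(TRAD_TO_SIMP)
    return text.strip()
```

For instance, `preprocess("喜歡這部電視劇 //@张三:不错")` returns `"喜欢这部电视剧 不错"`.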
In one possible implementation, a word segmentation approach from the related art can be used to segment the text to be analyzed. For example, new words and phrases can be extracted from all comment texts about a certain object and used as a segmentation dictionary for that object. The segmentation dictionary can then be used to segment the text to be analyzed, yielding its multiple word segments. The number of word segments is less than or equal to N, the number of pieces of feature information the analysis model can process; that is, number of word segments ≤ N. The present disclosure places no restriction on the specific way in which the multiple word segments of the text to be analyzed are obtained.
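Since the disclosure leaves the segmentation method open, the sketch below uses forward maximum matching purely as a stand-in technique to show how an object-specific dictionary could drive segmentation. The function name, the `max_word_len` parameter, and the sample dictionary are assumptions made for illustration.

```python
def segment(text, dictionary, max_word_len=4):
    """Forward maximum matching: at each position take the longest dictionary word."""
    tokens = []
    i = 0
    while i < len(text):
        match = text[i]  # fall back to a single character when nothing matches
        # Try the longest candidate first, down to two characters.
        for length in range(min(max_word_len, len(text) - i), 1, -1):
            candidate = text[i:i + length]
            if candidate in dictionary:
                match = candidate
                break
        tokens.append(match)
        i += len(match)
    return tokens
```

With the hypothetical dictionary `{"这部", "电视剧", "好看"}`, the call `segment("这部电视剧好看", ...)` yields `["这部", "电视剧", "好看"]`.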
Fig. 2 is a flowchart of step S11 of a text analysis method according to an exemplary embodiment. As shown in Fig. 2, in one possible implementation, step S11 may include:
in step S111, performing vectorization on each of the multiple word segments of the text to be analyzed to obtain multiple pieces of vector information corresponding to the multiple word segments;
in step S112, determining the feature information according to the multiple pieces of vector information.
For example, a pre-trained mapping model (such as Google's word2vec model) can be used to convert (map) each word segment of the text to be analyzed into a piece of vector information, that is, a real-valued row vector. When the number of word segments of the text to be analyzed is less than N, the remaining positions can be padded so that the total number of pieces of vector information is N. The N pieces of vector information obtained can then be determined as N pieces of feature information. In this way, the N pieces of feature information to be input into the analysis model for processing are obtained.
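A minimal sketch of steps S111 and S112 follows. A plain dictionary stands in for the pre-trained mapping model, and three choices are assumptions not stated in the source: padding uses zero vectors, unknown word segments map to zero vectors, and segments beyond N are dropped.

```python
def build_features(tokens, embeddings, n_max, k):
    """Map word segments to k-dimensional vectors and pad to exactly n_max rows."""
    zero = [0.0] * k
    # Assumption: segments beyond n_max are dropped; unknown segments map to zeros.
    rows = [list(embeddings.get(tok, zero)) for tok in tokens[:n_max]]
    # Assumption: the remaining positions are filled with zero vectors.
    rows.extend(list(zero) for _ in range(n_max - len(rows)))
    return rows
```

The result is a list of N rows with k values each, which matches the N-row, k-column matrix that the convolution module expects.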
Fig. 3 is a schematic diagram of the analysis model of a text analysis method according to an exemplary embodiment. As shown in Fig. 3, the analysis model includes a convolution module 31, a relationship module 32, and a splicing output module 33.
Fig. 4 is a flowchart of step S12 of a text analysis method according to an exemplary embodiment. As shown in Fig. 4, in one possible implementation, step S12 may include:
in step S121, inputting the feature information into the convolution module for processing to obtain a convolution result;
in step S122, inputting the convolution result into the relationship module for processing to obtain a relation result;
in step S123, inputting the relation result into the splicing output module for processing to obtain the text analysis result for the text to be analyzed.
For example, the convolution module 31 may include one or more convolutional neural networks. A convolutional neural network can effectively capture the local contextual information of a sentence.
For example, for the N pieces of feature information (vector information) of the text to be analyzed, if each piece of vector information is a k-dimensional real-valued row vector, that is, a vector of length k (k > 1), then the N pieces of feature information form a matrix with N rows and k columns. This N-row, k-column matrix can be input into the convolution module 31 for processing.
The convolution module 31 can apply d convolution kernels with different weights, each of size (h, k), to the N-row, k-column matrix, performing a convolution operation with each kernel to extract the local information of h consecutive word segments. After these convolution operations, d column vectors of dimension N-h+1 are obtained, which form a real matrix with N-h+1 rows and d columns (the convolution result). Each column of this real matrix corresponds to the output of one convolution kernel, and each row corresponds to one piece of local information of the text to be analyzed.
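The shape arithmetic above can be checked with a small pure-Python sketch. It is illustrative only: bias terms and activation functions are omitted, and the function name `convolve` is an assumption.

```python
def convolve(features, kernels):
    """Slide each (h, k) kernel over an N x k matrix; return an (N-h+1) x d matrix."""
    n = len(features)
    h = len(kernels[0])
    out = []
    for start in range(n - h + 1):
        window = features[start:start + h]  # h consecutive word-segment vectors
        row = []
        for kernel in kernels:
            # Element-wise product of the h x k window and the h x k kernel, summed.
            row.append(sum(w * x
                           for k_row, f_row in zip(kernel, window)
                           for w, x in zip(k_row, f_row)))
        out.append(row)
    return out
```

With N = 4, k = 2, h = 2, and d = 2, the output has 3 rows and 2 columns: each row summarizes one window of two consecutive word segments, and each column belongs to one kernel.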
In one possible implementation, the convolution module 31 may include multiple convolutional neural networks, each of which convolves the N pieces of feature information using convolution kernels of a different size (h, k), so that multiple real matrices are obtained as the convolution result. For example, convolution kernels with h = 2, 3, and 4 can be used respectively. In this way, local information of different sizes (h consecutive word segments) can be obtained from the text to be analyzed, so that local information of different sizes can be analyzed.
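As a quick worked example of the resulting sizes, the helper below lists the shape of each real matrix for several kernel heights. The values N = 10 and d = 8 used in the note are illustrative assumptions, not figures from the disclosure.

```python
def conv_output_shapes(n, d, heights=(2, 3, 4)):
    """Return the (rows, columns) of the convolution result for each kernel height h."""
    return {h: (n - h + 1, d) for h in heights}
```

For N = 10 and d = 8, kernels with h = 2, 3, and 4 give matrices of 9 x 8, 8 x 8, and 7 x 8 respectively.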
It should be understood that those skilled in the art can choose the convolutional neural networks according to actual needs and set parameters such as the number of weights and the convolution kernel size; the present disclosure places no restriction on this.
In one possible implementation, the convolution result can be input into the relationship module 32 for processing in step S122 to obtain the relation result. The relationship module 32 may include one or more relational networks (relation networks, RN). A relational network can be used to capture long-range dependencies between the word segments of the text to be analyzed and to extract the relation information between any two pieces of local information.
For example, the convolution result can be input into the relationship module 32 for processing. Let M = N-h+1; the convolution result can then be one or more real matrices with M rows and d columns. For each real matrix, each row (that is, each of the M d-dimensional real vectors o_1, o_2, …, o_M) represents one piece of local information of the text to be analyzed. In the relationship module 32, a multilayer perceptron b can be used to express the relation between any two pieces of local information, that is, the relation vector b(o_q, o_l), where 1 ≤ q < l ≤ M. Averaging all M(M-1)/2 relation vectors b(o_q, o_l) and inputting the result into another multilayer perceptron f yields the relation vector r, as shown in formula (1):
r = f\left( \frac{2}{M(M-1)} \sum_{1 \le q < l \le M} b(o_q, o_l) \right) \qquad (1)
Where the convolution result is one or more real matrices, the relationship module 32 may include one or more relational networks that process the matrices respectively, yielding one or more relation vectors r, which together serve as the final relation result.
It should be understood that those skilled in the art can choose the relational network and the multilayer perceptrons b and f according to actual needs; the present disclosure places no restriction on this. In this way, the relation result produced by the relationship module 32 is obtained.
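The pairwise computation in the relationship module can be sketched as below. Two simplifications are assumptions made for brevity: b and f are each reduced to a single fully connected layer with ReLU, and b receives the concatenation of the two local-information vectors. At least two rows (M ≥ 2) are required.

```python
from itertools import combinations


def dense_relu(vec, weights, bias):
    """One fully connected layer with ReLU; weights has one row per output unit."""
    return [max(0.0, sum(w * x for w, x in zip(row, vec)) + b)
            for row, b in zip(weights, bias)]


def relation_vector(conv_rows, b_params, f_params):
    """Average b(o_q, o_l) over all pairs q < l, then apply f to the mean."""
    pair_outputs = [dense_relu(o_q + o_l, *b_params)  # list + list concatenates
                    for o_q, o_l in combinations(conv_rows, 2)]
    count = len(pair_outputs)  # equals M * (M - 1) / 2
    mean = [sum(col) / count for col in zip(*pair_outputs)]
    return dense_relu(mean, *f_params)
```

Here `b_params` and `f_params` are `(weights, bias)` tuples. With several convolution results, one call per real matrix would produce one relation vector each.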
In one possible implementation, the relation result can be input into the splicing output module 33 for processing in step S123 to obtain the text analysis result for the text to be analyzed.
In one possible implementation, the splicing output module 33 may include multiple fully connected layers and a softmax layer, where step S123 may include:
performing vector splicing on the relation result to obtain spliced vector information;
inputting the spliced vector information sequentially into the multiple fully connected layers and the softmax layer for processing to obtain the text analysis result for the text to be analyzed.
For example, the multiple pieces of vector information in the relation result can be spliced to obtain the spliced vector information, whose length is the sum of the lengths of the individual pieces of vector information in the relation result. Inputting the spliced vector information sequentially into the multiple fully connected layers and the softmax layer for processing yields the text analysis result for the text to be analyzed. It should be understood that those skilled in the art can choose the fully connected layers and the softmax layer according to actual needs; the present disclosure places no restriction on this.
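The splicing output module can be illustrated with the following sketch. The ReLU between fully connected layers is an assumption, since the source does not name an activation, and `layers` is a hypothetical list of `(weights, bias)` pairs.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def splice_and_classify(relation_vectors, layers):
    """Concatenate the relation vectors, apply the dense layers, then softmax."""
    hidden = [x for vec in relation_vectors for x in vec]  # vector splicing
    for index, (weights, bias) in enumerate(layers):
        hidden = [sum(w * x for w, x in zip(row, hidden)) + b
                  for row, b in zip(weights, bias)]
        if index < len(layers) - 1:
            hidden = [max(0.0, v) for v in hidden]  # assumed ReLU between layers
    return softmax(hidden)
```

The output is a probability distribution over the analysis categories, for example sentiment classes; the category set itself is not fixed by the source.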
According to embodiments of the present disclosure, the initial analysis model can be trained before the analysis model is used to process feature information and obtain the text analysis result for the text to be analyzed.
Fig. 5 is a flowchart of a text analysis method according to an exemplary embodiment. As shown in Fig. 5, in one possible implementation, the method further includes:
in step S13, obtaining training feature information corresponding to multiple word segments of a sample text;
in step S14, inputting the training feature information into an initial analysis model for processing to obtain a training analysis result for the sample text, where the initial analysis model includes an initial convolution module, an initial relationship module, and an initial splicing output module;
in step S15, determining a model loss of the initial analysis model according to the training analysis result and an annotation result of the sample text;
in step S16, adjusting the parameter weights in the initial analysis model according to the model loss to determine an adjusted analysis model;
in step S17, determining the adjusted analysis model as the final analysis model when the model loss meets a training condition.
For example, existing comment texts can be analyzed manually and their analysis results labeled (the annotation results of the sample texts) to form a training set. Any sample text in the training set can be preprocessed and then segmented using a word segmentation approach from the related art to obtain its multiple word segments. The number of word segments is less than or equal to N, the number of pieces of feature information the analysis model can process; that is, number of word segments ≤ N.
In one possible implementation, a pre-trained mapping model (such as Google's word2vec model) can be used to map each word segment of the sample text to a piece of vector information. When the number of word segments is less than N, the remaining positions can be padded so that the total number of pieces of vector information is N, and the N pieces of vector information obtained are determined as the training feature information (N pieces of feature information) of the sample text.
In one possible implementation, the training feature information can be input into the initial analysis model for processing to obtain the training analysis result for the sample text, where the initial analysis model includes an initial convolution module, an initial relationship module, and an initial splicing output module. The structure and form of each module of the initial analysis model are as described above and are not repeated here.
In one possible implementation, the model loss of the initial analysis model is determined according to the training analysis result and the annotation result of the sample text. The specific type of loss function can be chosen by those skilled in the art according to the actual situation; the present disclosure places no restriction on this.
In one possible implementation, the parameter weights in the initial analysis model can be adjusted according to the model loss of the initial analysis model to determine the adjusted analysis model. For example, a back-propagation algorithm, such as the BPTT (Back Propagation Through Time) algorithm, can be used to compute the gradient of the model loss with respect to the parameter weights of the initial analysis model and to adjust the parameter weights based on that gradient.
In one possible implementation, the model adjustment process of steps S14 to S16 can be repeated multiple times. A training condition can be preset, which may include a set number of training iterations and/or a set convergence condition. When the model loss meets the training condition, the most recently adjusted analysis model can be considered to meet the accuracy requirement and can be determined as the final analysis model.
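The loop over steps S14 to S17 can be sketched as follows. To keep the example short, a single-layer softmax classifier stands in for the full analysis model, cross-entropy is assumed as the loss function, and the training condition is assumed to be either a fixed iteration count or a small change in loss; none of these choices come from the source.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def train(samples, labels, n_classes, lr=0.5, max_iters=200, tol=1e-6):
    """Repeat forward pass (S14), loss (S15), weight update (S16), stop check (S17)."""
    dim = len(samples[0])
    weights = [[0.0] * dim for _ in range(n_classes)]
    previous = float("inf")
    loss = previous
    for _ in range(max_iters):  # assumed cap on training iterations
        grads = [[0.0] * dim for _ in range(n_classes)]
        loss = 0.0
        for x, y in zip(samples, labels):
            probs = softmax([sum(w * v for w, v in zip(row, x)) for row in weights])
            loss -= math.log(probs[y])  # cross-entropy against the annotation
            for c in range(n_classes):
                err = probs[c] - (1.0 if c == y else 0.0)
                for j in range(dim):
                    grads[c][j] += err * x[j]
        loss /= len(samples)
        if abs(previous - loss) < tol:  # assumed convergence condition
            break
        previous = loss
        for c in range(n_classes):  # gradient step on the parameter weights
            for j in range(dim):
                weights[c][j] -= lr * grads[c][j] / len(samples)
    return weights, loss
```

In the full model the same loop would apply, with gradients propagated back through the splicing output, relationship, and convolution modules.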
In this way, training the initial analysis model on the training feature information of the sample texts yields an analysis model that meets the training condition, so that the analysis model can accurately extract the opinions and sentiment orientation in the text to be analyzed.
According to embodiments of the present disclosure, feature information corresponding to multiple word segments of a text to be analyzed can be obtained and input into an analysis model for processing to obtain a text analysis result. Performing text analysis with an analysis model that includes a convolution module, a relationship module, and a splicing output module improves the accuracy of the text analysis result. Embodiments of the present disclosure can help business personnel understand the angle from which users comment on an object and whether their attitude is positive or negative, so that the value of the comment information (the text to be analyzed) is fully exploited.
Fig. 6 is a block diagram of a text analysis device according to an exemplary embodiment. As shown in Fig. 6, the text analysis device includes:
a feature acquiring unit 71, configured to obtain feature information corresponding to multiple word segments of a text to be analyzed;
a result acquiring unit 72, configured to input the feature information into an analysis model for processing to obtain a text analysis result for the text to be analyzed,
where the analysis model includes a convolution module, a relationship module, and a splicing output module.
Fig. 7 is a block diagram of a text analysis device according to an exemplary embodiment. As shown in Fig. 7, in one possible implementation, the result acquiring unit 72 may include:
a first result acquiring subunit 721, configured to input the feature information into the convolution module for processing to obtain a convolution result;
a second result acquiring subunit 722, configured to input the convolution result into the relationship module for processing to obtain a relation result;
a third result acquiring subunit 723, configured to input the relation result into the splicing output module for processing to obtain the text analysis result for the text to be analyzed.
As shown in Fig. 7, in one possible implementation, the feature acquiring unit 71 may include:
a vectorization subunit 711, configured to perform vectorization on each of the multiple word segments of the text to be analyzed to obtain multiple pieces of vector information corresponding to the multiple word segments;
a feature determining subunit 712, configured to determine the feature information of the multiple word segments according to the multiple pieces of vector information.
In one possible implementation, the splicing output module includes multiple fully connected layers and a softmax layer, where the third result acquiring subunit 723 may include:
a splicing subunit, configured to perform vector splicing on the relation result to obtain spliced vector information;
an information processing subunit, configured to input the spliced vector information sequentially into the multiple fully connected layers and the softmax layer for processing to obtain the text analysis result for the text to be analyzed.
As shown in Fig. 7, in a possible implementation, the device may further include:
A training feature acquiring unit 73, configured to obtain training feature information corresponding to multiple word segments of a sample text;
A training result acquiring unit 74, configured to input the training feature information into an initial analysis model for processing, to obtain a training analysis result of the sample text, wherein the initial analysis model includes an initial convolution module, an initial relationship module, and an initial splicing output module;
A loss determining unit 75, configured to determine a model loss of the initial analysis model according to the training analysis result and an annotation result of the sample text;
A model adjusting unit 76, configured to adjust the parameter weights in the initial analysis model according to the model loss, to determine an adjusted analysis model;
A model determining unit 77, configured to determine the adjusted analysis model as the final analysis model when the model loss satisfies a training condition.
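The cycle carried out by units 74 to 77 (predict, compute the loss, adjust the weights, stop once the loss satisfies the training condition) can be sketched as follows. The sketch makes several assumptions not stated in the disclosure: the model is reduced to a single softmax layer, the loss is cross-entropy, the weights are adjusted by plain gradient descent, and the training condition is a loss threshold. All names and default values are hypothetical.

```python
import math
from typing import List, Tuple


def softmax(logits: List[float]) -> List[float]:
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def train(samples: List[List[float]], labels: List[int], num_classes: int,
          lr: float = 0.5, loss_threshold: float = 0.1,
          max_epochs: int = 1000) -> Tuple[List[List[float]], float]:
    dim = len(samples[0])
    weights = [[0.0] * dim for _ in range(num_classes)]  # initial model
    loss = float("inf")
    for _ in range(max_epochs):
        loss = 0.0
        grads = [[0.0] * dim for _ in range(num_classes)]
        for x, y in zip(samples, labels):
            # Training analysis result for one sample (role of unit 74).
            probs = softmax([sum(w * xi for w, xi in zip(row, x))
                             for row in weights])
            # Cross-entropy against the annotated label (role of unit 75).
            loss -= math.log(probs[y])
            for c in range(num_classes):
                error = probs[c] - (1.0 if c == y else 0.0)
                for j in range(dim):
                    grads[c][j] += error * x[j]
        loss /= len(samples)
        # Assumed training condition: stop once the mean loss is small enough
        # (role of unit 77). The returned weights match the returned loss.
        if loss < loss_threshold:
            break
        # Adjust the parameter weights along the negative gradient (unit 76).
        for c in range(num_classes):
            for j in range(dim):
                weights[c][j] -= lr * grads[c][j] / len(samples)
    return weights, loss
```

Checking the condition before the update means the weights returned are exactly the ones that produced the reported loss.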
In a possible implementation, the convolution module includes a convolutional neural network, and the relationship module includes a relation network.
Fig. 8 is a block diagram of a text analysis device 1900 according to an exemplary embodiment. For example, the device 1900 may be provided as a server. Referring to Fig. 8, the device 1900 includes a processing component 1922, which further includes one or more processors, and memory resources represented by a memory 1932 for storing instructions executable by the processing component 1922, such as application programs. The application programs stored in the memory 1932 may include one or more modules, each corresponding to a set of instructions. In addition, the processing component 1922 is configured to execute the instructions to perform the above method.
The device 1900 may also include a power supply component 1926 configured to perform power management of the device 1900, a wired or wireless network interface 1950 configured to connect the device 1900 to a network, and an input/output (I/O) interface 1958. The device 1900 can operate based on an operating system stored in the memory 1932, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, or the like.
In an exemplary embodiment, a non-volatile computer-readable storage medium is also provided, for example the memory 1932 including computer program instructions, which can be executed by the processing component 1922 of the device 1900 to complete the above method.
The present disclosure may be a system, a method, and/or a computer program product. The computer program product may include a computer-readable storage medium carrying computer-readable program instructions for causing a processor to implement various aspects of the present disclosure.
The computer-readable storage medium may be a tangible device that can retain and store instructions for use by an instruction execution device. The computer-readable storage medium may be, for example, but is not limited to, an electronic storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer-readable storage medium include: a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a static random access memory (SRAM), a portable compact disc read-only memory (CD-ROM), a digital versatile disc (DVD), a memory stick, a floppy disk, a mechanically encoded device such as punch cards or raised structures in a groove having instructions stored thereon, and any suitable combination of the foregoing. A computer-readable storage medium, as used herein, is not to be construed as being transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transmission media (for example, light pulses passing through a fiber-optic cable), or electrical signals transmitted through a wire.
The computer-readable program instructions described herein can be downloaded from a computer-readable storage medium to respective computing/processing devices, or to an external computer or external storage device via a network, for example the Internet, a local area network, a wide area network, and/or a wireless network. The network may include copper transmission cables, optical fiber transmission, wireless transmission, routers, firewalls, switches, gateway computers, and/or edge servers. A network adapter card or network interface in each computing/processing device receives the computer-readable program instructions from the network and forwards them for storage in a computer-readable storage medium within the respective computing/processing device.
The computer program instructions for carrying out operations of the present disclosure may be assembler instructions, instruction set architecture (ISA) instructions, machine instructions, machine-dependent instructions, microcode, firmware instructions, state-setting data, or source code or object code written in any combination of one or more programming languages, including object-oriented programming languages such as Smalltalk and C++, and conventional procedural programming languages such as the "C" language or similar programming languages. The computer-readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In scenarios involving a remote computer, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, electronic circuitry, such as programmable logic circuitry, a field-programmable gate array (FPGA), or a programmable logic array (PLA), is personalized by utilizing state information of the computer-readable program instructions, and the electronic circuitry can execute the computer-readable program instructions in order to implement aspects of the present disclosure.
Aspects of the present disclosure are described herein with reference to flowcharts and/or block diagrams of methods, apparatuses (systems), and computer program products according to embodiments of the present disclosure. It should be understood that each block of the flowcharts and/or block diagrams, and combinations of blocks in the flowcharts and/or block diagrams, can be implemented by computer-readable program instructions.
These computer-readable program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, or another programmable data processing apparatus to produce a machine, such that the instructions, when executed by the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams. These computer-readable program instructions may also be stored in a computer-readable storage medium and cause a computer, a programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer-readable medium having the instructions stored therein comprises an article of manufacture including instructions that implement aspects of the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams.
The computer-readable program instructions may also be loaded onto a computer, another programmable data processing apparatus, or another device to cause a series of operational steps to be performed on the computer, other programmable data processing apparatus, or other device to produce a computer-implemented process, such that the instructions executed on the computer, other programmable data processing apparatus, or other device implement the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams.
The flowcharts and block diagrams in the drawings illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to multiple embodiments of the present disclosure. In this regard, each block in the flowcharts or block diagrams may represent a module, a program segment, or a portion of instructions, which comprises one or more executable instructions for implementing the specified logical function. In some alternative implementations, the functions noted in the blocks may occur out of the order noted in the drawings. For example, two consecutive blocks may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functionality involved. It should also be noted that each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified functions or acts, or by a combination of dedicated hardware and computer instructions.
The embodiments of the present disclosure have been described above. The foregoing description is exemplary, not exhaustive, and is not limited to the disclosed embodiments. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terminology used herein was chosen to best explain the principles of the embodiments, their practical application, or their technical improvement over technologies found in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.

Claims (14)

1. A text analysis method, characterized by comprising:
Obtaining feature information corresponding to multiple word segments of a text to be analyzed;
Inputting the feature information into an analysis model for processing, to obtain a text analysis result of the text to be analyzed,
Wherein the analysis model includes a convolution module, a relationship module, and a splicing output module.
2. The method according to claim 1, characterized in that inputting the feature information into the analysis model for processing, to obtain the text analysis result of the text to be analyzed, comprises:
Inputting the feature information into the convolution module for processing, to obtain a convolution result;
Inputting the convolution result into the relationship module for processing, to obtain a relation result;
Inputting the relation result into the splicing output module for processing, to obtain the text analysis result of the text to be analyzed.
3. The method according to claim 1, characterized in that obtaining the feature information corresponding to the multiple word segments of the text to be analyzed comprises:
Vectorizing each of the multiple word segments of the text to be analyzed, to obtain multiple pieces of vector information corresponding to the multiple word segments;
Determining the feature information of the multiple word segments according to the multiple pieces of vector information.
4. The method according to claim 2, characterized in that the splicing output module includes multiple fully connected layers and a softmax processing layer,
Wherein inputting the relation result into the splicing output module for processing, to obtain the text analysis result of the text to be analyzed, comprises:
Performing vector splicing on the relation result, to obtain spliced vector information;
Inputting the spliced vector information sequentially into the multiple fully connected layers and the softmax processing layer for processing, to obtain the text analysis result of the text to be analyzed.
5. The method according to claim 1, characterized in that the method further comprises:
Obtaining training feature information corresponding to multiple word segments of a sample text;
Inputting the training feature information into an initial analysis model for processing, to obtain a training analysis result of the sample text, wherein the initial analysis model includes an initial convolution module, an initial relationship module, and an initial splicing output module;
Determining a model loss of the initial analysis model according to the training analysis result and an annotation result of the sample text;
Adjusting the parameter weights in the initial analysis model according to the model loss, to determine an adjusted analysis model;
Determining the adjusted analysis model as the final analysis model when the model loss satisfies a training condition.
6. The method according to any one of claims 1 to 5, characterized in that the convolution module includes a convolutional neural network, and the relationship module includes a relation network.
7. A text analysis device, characterized by comprising:
A feature acquiring unit, configured to obtain feature information corresponding to multiple word segments of a text to be analyzed;
A result acquiring unit, configured to input the feature information into an analysis model for processing, to obtain a text analysis result of the text to be analyzed,
Wherein the analysis model includes a convolution module, a relationship module, and a splicing output module.
8. The device according to claim 7, characterized in that the result acquiring unit includes:
A first result acquiring subunit, configured to input the feature information into the convolution module for processing, to obtain a convolution result;
A second result acquiring subunit, configured to input the convolution result into the relationship module for processing, to obtain a relation result;
A third result acquiring subunit, configured to input the relation result into the splicing output module for processing, to obtain the text analysis result of the text to be analyzed.
9. The device according to claim 7, characterized in that the feature acquiring unit includes:
A vectorization subunit, configured to vectorize each of the multiple word segments of the text to be analyzed, to obtain multiple pieces of vector information corresponding to the multiple word segments;
A feature determining subunit, configured to determine the feature information of the multiple word segments according to the multiple pieces of vector information.
10. The device according to claim 8, characterized in that the splicing output module includes multiple fully connected layers and a softmax processing layer,
Wherein the third result acquiring subunit includes:
A splicing subunit, configured to perform vector splicing on the relation result, to obtain spliced vector information;
An information processing subunit, configured to input the spliced vector information sequentially into the multiple fully connected layers and the softmax processing layer for processing, to obtain the text analysis result of the text to be analyzed.
11. The device according to claim 7, characterized in that the device further includes:
A training feature acquiring unit, configured to obtain training feature information corresponding to multiple word segments of a sample text;
A training result acquiring unit, configured to input the training feature information into an initial analysis model for processing, to obtain a training analysis result of the sample text, wherein the initial analysis model includes an initial convolution module, an initial relationship module, and an initial splicing output module;
A loss determining unit, configured to determine a model loss of the initial analysis model according to the training analysis result and an annotation result of the sample text;
A model adjusting unit, configured to adjust the parameter weights in the initial analysis model according to the model loss, to determine an adjusted analysis model;
A model determining unit, configured to determine the adjusted analysis model as the final analysis model when the model loss satisfies a training condition.
12. The device according to any one of claims 7 to 11, characterized in that the convolution module includes a convolutional neural network, and the relationship module includes a relation network.
13. A text analysis device, characterized by comprising:
A processor;
A memory for storing instructions executable by the processor;
Wherein the processor is configured to perform the method according to any one of claims 1 to 6.
14. A non-volatile computer-readable storage medium having computer program instructions stored thereon, characterized in that the computer program instructions, when executed by a processor, implement the method according to any one of claims 1 to 6.
CN201810252454.6A 2018-03-26 2018-03-26 Text analysis method and device Active CN110362808B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810252454.6A CN110362808B (en) 2018-03-26 2018-03-26 Text analysis method and device

Publications (2)

Publication Number Publication Date
CN110362808A true CN110362808A (en) 2019-10-22
CN110362808B CN110362808B (en) 2022-06-14

Family

ID=68212093

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810252454.6A Active CN110362808B (en) 2018-03-26 2018-03-26 Text analysis method and device

Country Status (1)

Country Link
CN (1) CN110362808B (en)

Citations (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070179776A1 (en) * 2006-01-27 2007-08-02 Xerox Corporation Linguistic user interface
US20110225155A1 (en) * 2010-03-10 2011-09-15 Xerox Corporation System and method for guiding entity-based searching
CN106372058A (en) * 2016-08-29 2017-02-01 中译语通科技(北京)有限公司 Short text emotion factor extraction method and device based on deep learning
CN106599933A (en) * 2016-12-26 2017-04-26 哈尔滨工业大学 Text emotion classification method based on the joint deep learning model
US20170124432A1 (en) * 2015-11-03 2017-05-04 Baidu Usa Llc Systems and methods for attention-based configurable convolutional neural networks (abc-cnn) for visual question answering
CN106951438A (en) * 2017-02-13 2017-07-14 北京航空航天大学 A kind of event extraction system and method towards open field
CN107092596A (en) * 2017-04-24 2017-08-25 重庆邮电大学 Text emotion analysis method based on attention CNNs and CCR
CN107180247A (en) * 2017-05-19 2017-09-19 中国人民解放军国防科学技术大学 Relation grader and its method based on selective attention convolutional neural networks
CN107247702A (en) * 2017-05-05 2017-10-13 桂林电子科技大学 A kind of text emotion analysis and processing method and system
CN107341145A (en) * 2017-06-21 2017-11-10 华中科技大学 A kind of user feeling analysis method based on deep learning
CN107391709A (en) * 2017-07-28 2017-11-24 深圳市唯特视科技有限公司 A kind of method that image captions generation is carried out based on new attention model
CN107391483A (en) * 2017-07-13 2017-11-24 武汉大学 A kind of comment on commodity data sensibility classification method based on convolutional neural networks
CN107491531A (en) * 2017-08-18 2017-12-19 华南师范大学 Chinese network comment sensibility classification method based on integrated study framework
CN107515855A (en) * 2017-08-18 2017-12-26 武汉红茶数据技术有限公司 The microblog emotional analysis method and system of a kind of combination emoticon
CN107526725A (en) * 2017-09-04 2017-12-29 北京百度网讯科技有限公司 The method and apparatus for generating text based on artificial intelligence
CN107609009A (en) * 2017-07-26 2018-01-19 北京大学深圳研究院 Text emotion analysis method, device, storage medium and computer equipment
CN107608956A (en) * 2017-09-05 2018-01-19 广东石油化工学院 A kind of reader's mood forecast of distribution algorithm based on CNN GRNN

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
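Wan Shengxian et al.: "Localized bidirectional long short-term memory for text classification", Journal of Chinese Information Processing (《中文信息学报》) *
Liang Bin et al.: "Aspect-based sentiment analysis based on multi-attention convolutional neural networks", Journal of Computer Research and Development (《计算机研究与发展》) *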

Also Published As

Publication number Publication date
CN110362808B (en) 2022-06-14

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20200506

Address after: 310052 room 508, floor 5, building 4, No. 699, Wangshang Road, Changhe street, Binjiang District, Hangzhou City, Zhejiang Province

Applicant after: Alibaba (China) Co.,Ltd.

Address before: 100080 Beijing Haidian District city Haidian street A Sinosteel International Plaza No. 8 block 5 layer A, C

Applicant before: Youku network technology (Beijing) Co., Ltd

CB02 Change of applicant information

Address after: Room 554, 5 / F, building 3, 969 Wenyi West Road, Wuchang Street, Yuhang District, Hangzhou City, Zhejiang Province

Applicant after: Alibaba (China) Co.,Ltd.

Address before: 310052 room 508, 5th floor, building 4, No. 699 Wangshang Road, Changhe street, Binjiang District, Hangzhou City, Zhejiang Province

Applicant before: Alibaba (China) Co.,Ltd.

GR01 Patent grant