CN105630768B

CN105630768B - Product name recognition method and device based on stacked conditional random fields

Info

Publication number: CN105630768B
Application number: CN201510974820.5A
Authority: CN
Inventors: 黄河燕; 杨献祥
Original assignee: Beijing Institute of Technology BIT
Current assignee: Beijing Institute of Technology BIT
Priority date: 2015-12-23
Filing date: 2015-12-23
Publication date: 2018-10-12
Anticipated expiration: 2035-12-23
Also published as: CN105630768A

Abstract

The present invention relates to a context-aware product name recognition method and device based on stacked conditional random fields, and belongs to the technical field of Internet data processing and analysis. The method represents words as word vectors, measures the semantic similarity of words by the similarity of their vectors, and fuses global context information by combining word vectors with word clustering. To handle the complex, nested structure of product names, recognition is performed with a stacked conditional random field model. Compared with the prior art, the invention effectively addresses insufficient context information and complex nested structure in product name recognition, improves recognition performance for structurally complex product names, and achieves higher precision and F1 than conventional methods.

Description

A product name recognition method and device based on stacked conditional random fields

Technical field

The invention belongs to the field of Internet data processing and analysis, and relates to a context-aware product name recognition method and device based on stacked conditional random fields.

Background art

In the Web 2.0 era, with the rise of social networking platforms such as microblogs, Internet users are no longer only readers of information but also publishers of it, and the Internet has changed from an information publishing platform into an interactive one. Over the past decade, China's e-commerce industry has grown steadily, and more and more companies conduct business online and sell products through online promotion. As of December 2013, 23.5% of enterprises nationwide sold online and 20.9% promoted themselves through the Internet. More and more people shop online and discuss their purchases, commenting on the strengths and weaknesses of products they have used or bought on forums, microblogs, and shopping websites. Before buying, people habitually use search engines to check user reviews of the product, and other users' opinions of a product's quality can influence their purchase decisions. Enterprises of all kinds have opened official microblog accounts to promote their products through this new medium. Today not only governments follow the spread of topics on the Internet; commercial enterprises also monitor and analyze forums, microblogs, blogs, and other online information, hoping to track the market reputation of their products, understand users' opinions and suggestions, and watch continuously for negative reviews so that they can manage crises promptly and protect the company's reputation. The Internet has become an important channel through which companies in every industry obtain competitive intelligence from public sources: companies follow the market performance and new product releases of their competitors so as to make suitable decisions in time. For any enterprise, the most basic part of following Internet information is following the products of its own industry and the products it makes itself. Accurately recognizing product names in the massive data on the Internet is therefore the basis and prerequisite for industry public-opinion monitoring, reputation analysis, and business intelligence.

Product name recognition identifies product name entities in text. It is a subfield of proper noun recognition within information extraction, and aims to recognize the entities in a text that denote product names so as to support upper-layer applications such as business intelligence. Current research on proper noun recognition mainly targets traditional named entities such as person names, place names, and organization names. With the development of the Internet and e-commerce, recognizing product names is becoming increasingly important, yet work on it remains relatively scarce. Unlike traditional named entities, product names usually have a more complex structure: they typically contain digits, letters, special characters, and Chinese characters, they are relatively long, and nesting is common. In addition, the Web 2.0 Internet is flooded with user-generated text. Because users differ in writing skill and habits of expression, such text is much harder to process than traditional media such as news, and its application value is also higher. To recognize product names more accurately in massive Internet data, both local and global context information must be considered and the recognition method improved.

Summary of the invention

The object of the invention is to improve product name recognition by focusing on the nesting problem of product names while making full use of context information. It proposes a context-aware product name recognition method based on stacked conditional random fields that effectively handles nesting in product names and uses both local and global context information to improve the features, thereby raising recognition performance.

The idea of the invention is to fuse global context information through a word vector model and word clustering, compensating for the limits of local context information, while using a stacked conditional random field model to recognize product names with nested structure.

The object of the present invention is achieved through the following technical solution:

A context-aware product name recognition method based on stacked conditional random fields, comprising the following steps:

Step 1: Preprocess the corpus text by word segmentation and part-of-speech tagging;

Step 2: Represent the corpus text with primary features, word by word;

Step 3: Represent the current word according to the feature template required by the trained low-level conditional random field model, then recognize it with the trained low-level conditional random field model to obtain the preliminary recognition result, denoted label 1;

Step 4: Take the primary feature representation of each word plus its label 1 as the secondary feature representation;

Step 5: Represent the current word according to the feature template required by the trained low-level conditional random field model, then recognize it with the trained high-level conditional random field model to obtain the final recognition result, denoted label 2;

Step 6: Attach the corresponding label to each word in the corpus text recognized as part of a product entity, and output the result.
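The data flow of steps 3 to 6 can be sketched as follows. This is a minimal illustration rather than the patented implementation: the two taggers are passed in as plain callables, and the names `stacked_recognize`, `toy_low`, `toy_high`, and the `kind` feature are hypothetical stand-ins for trained conditional random field models and real features.

```python
from typing import Callable, Dict, List, Tuple

Features = Dict[str, str]
Tagger = Callable[[List[Features]], List[str]]


def stacked_recognize(
    words: List[str],
    primary: List[Features],
    low_tagger: Tagger,
    high_tagger: Tagger,
) -> List[Tuple[str, str]]:
    """Run both layers and return (word, label 2) pairs for entity words."""
    # Step 3: the low-level model assigns label 1 to every word.
    label1 = low_tagger(primary)
    # Step 4: secondary features = primary features plus label 1.
    secondary = [dict(f, label1=l1) for f, l1 in zip(primary, label1)]
    # Step 5: the high-level model assigns label 2.
    label2 = high_tagger(secondary)
    # Step 6: keep only the words recognized as part of a product entity.
    return [(w, l2) for w, l2 in zip(words, label2) if l2 != "O"]


def toy_low(feats: List[Features]) -> List[str]:
    """Hypothetical stand-in for the low-level model: maps a 'kind' feature to label 1."""
    table = {"brand": "B-BRA", "series": "B-SER", "model": "B-TYP"}
    return [table.get(f["kind"], "O") for f in feats]


def toy_high(feats: List[Features]) -> List[str]:
    """Hypothetical stand-in for the high-level model: merges adjacent simple entities."""
    out, inside = [], False
    for f in feats:
        if f["label1"] == "O":
            out.append("O")
            inside = False
        else:
            out.append("I-PRO" if inside else "B-PRO")
            inside = True
    return out


words = ["I", "like", "Samsung", "Galaxy", "S3", "."]
kinds = ["none", "none", "brand", "series", "model", "none"]
result = stacked_recognize(words, [{"kind": k} for k in kinds], toy_low, toy_high)
```

In a real system the two callables would wrap trained sequence models; the point of the sketch is only that label 1 becomes an extra input column for the second layer.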

Preferably, the primary features comprise basic features, domain features, and a category feature. The basic features describe properties of the word itself: the word, its part of speech, whether it contains letters, whether it contains digits, and whether it contains special characters. The domain features describe the word's relation to the domain: whether the current word is a brand name, a series name, a model name, or a product attribute. The category feature indicates the category to which the word belongs.
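The basic features can be computed directly from the word string. The sketch below is an illustration under stated assumptions: the source does not define "special character", so the pattern used here (anything other than an ASCII letter, an ASCII digit, or a CJK ideograph) is an assumption, as are the function name `basic_features` and the Y/N encoding.

```python
import re

_LETTER = re.compile(r"[A-Za-z]")
_DIGIT = re.compile(r"[0-9]")
# Assumption: a "special character" is anything other than an ASCII letter,
# an ASCII digit, or a CJK unified ideograph (U+4E00..U+9FFF).
_SPECIAL = re.compile(r"[^A-Za-z0-9\u4e00-\u9fff]")


def basic_features(word: str, pos: str) -> dict:
    """Return the basic features of one segmented word as Y/N flags."""

    def flag(pattern: "re.Pattern") -> str:
        return "Y" if pattern.search(word) else "N"

    return {
        "word": word,
        "pos": pos,  # part-of-speech tag supplied by the segmenter
        "has_letter": flag(_LETTER),
        "has_digit": flag(_DIGIT),
        "has_special": flag(_SPECIAL),
    }
```

For example, `basic_features("Note2", "n")` flags both letters and digits, which is typical of model names.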

Preferably, the domain features are determined by string matching against a domain product knowledge base, which is built by the following process:

Crawl product-related data from websites of the domain;

Parse the crawled data to obtain a preliminary product entity list;

Manually correct the preliminary product entity list, specifying the brand, series, and model of each product entity, then build and store a product entity list containing each product entity with its brand, series, and model;

Using the crawled data as reference, manually compile and store a list of common attributes of the domain's products.
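Determining domain features by string matching can be sketched as a set lookup. The two product rows below are taken from the example product entity list in the embodiment; the attribute names, the exact-match rule, and the function name `domain_features` are assumptions made for illustration.

```python
# Product entity list: rows taken from the embodiment's example table.
PRODUCT_ENTITIES = [
    {"product": "Samsung Galaxy Note2", "brand": "Samsung", "series": "Galaxy", "model": "Note2"},
    {"product": "Nokia Lumia 920", "brand": "Nokia", "series": "Lumia", "model": "920"},
]
# Hypothetical attribute list; the source does not reproduce its own.
PRODUCT_ATTRIBUTES = ["screen", "battery"]

# Build lookup sets once; empty fields are skipped.
BRANDS = {e["brand"] for e in PRODUCT_ENTITIES if e["brand"]}
SERIES = {e["series"] for e in PRODUCT_ENTITIES if e["series"]}
MODELS = {e["model"] for e in PRODUCT_ENTITIES if e["model"]}
ATTRIBUTES = set(PRODUCT_ATTRIBUTES)


def domain_features(word: str) -> dict:
    """Return Y/N domain features for a word by exact string matching."""

    def flag(vocabulary: set) -> str:
        return "Y" if word in vocabulary else "N"

    return {
        "is_brand": flag(BRANDS),
        "is_series": flag(SERIES),
        "is_model": flag(MODELS),
        "is_attribute": flag(ATTRIBUTES),
    }
```

Exact matching is the simplest choice; a production system might also normalize case or width before the lookup.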

Preferably, the category feature of the current word is determined by the following process:

Cluster the words in the word vector model according to the similarity between them, where the similarity between two words A and B is computed from their corresponding word vectors by the following formula:

After clustering, assign a unique class number to each cluster;

Output the class number of the cluster to which the current word belongs.
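The similarity formula itself is not reproduced in this text. A common choice for comparing word vectors is cosine similarity, and the sketch below assumes that measure; it should not be read as the formula of the patent.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors of equal length.

    Assumption: cosine similarity stands in for the similarity formula,
    which the source text does not show.
    """
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)
```

With this measure, vectors pointing in the same direction score 1 regardless of length, and orthogonal vectors score 0.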

Preferably, the word vector model is obtained by the following process:

Download web pages related to the domain and parse them into plain text;

Segment the downloaded text into words;

Train the word vector model on the segmented text.

Preferably, label 1 and label 2 are annotated in the BIO scheme, where B marks the beginning of an entity, I marks the part of an entity other than its beginning, and O marks non-entity parts. The labels obtained in this way take one of the following values:

B-BRA: the first element of a brand name;

I-BRA: an element of a brand name other than the first;

B-SER: the first element of a series name;

I-SER: an element of a series name other than the first;

B-TYP: the first element of a model name;

I-TYP: an element of a model name other than the first;

B-COM: the first element of a company name;

I-COM: an element of a company name other than the first;

B-PRO: the first element of a product name;

I-PRO: an element of a product name other than the first;

O: a non-entity element.
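Reading entities back out of a BIO label sequence can be sketched as below. The function name `decode_bio` is hypothetical, and joining words with a space is an assumption suited to the English rendering; segmented Chinese text would normally be joined without spaces.

```python
from typing import List, Tuple


def decode_bio(words: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group a BIO-tagged word sequence into (entity text, entity type) pairs."""
    entities: List[Tuple[str, str]] = []
    current: List[str] = []
    current_type = None

    def close() -> None:
        # Emit the entity collected so far, if any.
        if current:
            entities.append((" ".join(current), current_type))

    for word, tag in zip(words, tags):
        if tag.startswith("B-"):
            close()
            current, current_type = [word], tag[2:]
        elif tag.startswith("I-") and current_type == tag[2:]:
            current.append(word)
        else:
            # "O", or an I- tag that does not continue the open entity:
            # close the open entity and skip this word.
            close()
            current, current_type = [], None
    close()
    return entities


words = ["I", "like", "Samsung", "Galaxy", "S3", "."]
simple = decode_bio(words, ["O", "O", "B-BRA", "B-SER", "B-TYP", "O"])
product = decode_bio(words, ["O", "O", "B-PRO", "I-PRO", "I-PRO", "O"])
```

The same words yield three simple entities under label 1 and one nested product name under label 2, which is the distinction the two layers are meant to capture.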

Preferably, the trained low-level and high-level conditional random field models are obtained by the following process:

Collect product-related text as the training corpus;

Perform word segmentation and part-of-speech tagging on the training corpus;

Annotate the entities that occur in the segmented text, such as brands, series, models, companies, and product names, to obtain sentences containing product entities;

Represent the product entities with primary features, label 1, and label 2;

Train a conditional random field model on the product entities represented with primary features and label 1 to obtain the trained low-level conditional random field model; the feature template should cover the features of the previous word, the current word, and the next word;

Train a conditional random field model on the product entities represented with primary features, label 1, and label 2 to obtain the trained high-level conditional random field model; the feature template should cover the features of the previous word, the current word, and the next word.

A context-aware product name recognition device based on stacked conditional random fields comprises a domain product knowledge base, a word vector model, a trained low-level conditional random field model, a trained high-level conditional random field model, a text preprocessing module, a primary feature representation module, a secondary feature representation module, a preliminary product name recognition module, a final product name recognition module, and a recognition result output module. The text preprocessing module, primary feature representation module, preliminary product name recognition module, secondary feature representation module, final product name recognition module, and recognition result output module are connected in sequence. The domain product knowledge base and the word vector model are each connected to the primary feature representation module, the trained low-level conditional random field model is connected to the preliminary product name recognition module, and the trained high-level conditional random field model is connected to the final product name recognition module;

The domain product knowledge base is built by the knowledge base construction process of claim 3 and comprises a product entity list and a common attribute list;

The word vector model is obtained by the word vector training process of claim 5;

The trained low-level conditional random field model and the trained high-level conditional random field model are obtained by the process of claim 7;

The text preprocessing module receives the text in which product names are to be recognized and performs word segmentation and part-of-speech tagging on it;

The primary feature representation module takes all words and their parts of speech produced by the text preprocessing module and derives their feature values from the domain product knowledge base and the word vector model, that is, it represents each word with primary features;

The preliminary product name recognition module takes all words and their primary features output by the primary feature representation module, fuses the primary features of each word's previous and next words, and recognizes them with the trained low-level conditional random field model to obtain the preliminary recognition result, label 1;

The secondary feature representation module combines the primary features of each word with the label 1 output by the preliminary product name recognition module to obtain the secondary feature representation of that word;

The final product name recognition module takes all words and their secondary features output by the secondary feature representation module, fuses the secondary features of each word's previous and next words, and recognizes them with the trained high-level conditional random field model to obtain the final recognition result, label 2;

The recognition result output module takes all words and their label 2 output by the final product name recognition module, filters out elements that are not product name entities to obtain a recognition result list, replaces the corresponding words in the input text with the words and labels in the list, and outputs the result.

Preferably, the domain product knowledge base is periodically supplemented with the latest words of the domain, and the word vector model is periodically retrained on newly added related text by the word vector training process of claim 5.

Preferably, the primary features are represented as described in claim 2, and label 1 and label 2 are annotated in the manner of claim 6.

Beneficial effects

The present invention addresses the complex structure of product names and the underuse of context information: it fuses global context information through word vectors and uses stacked conditional random fields to recognize product names with complex structure. Compared with the prior art, the method effectively addresses insufficient context information and complex nested structure in product name recognition and improves recognition performance for structurally complex product names. Its precision and F1 are higher than those of traditional methods. The invention is widely applicable to product name recognition in news, microblogs, forums, and other social media.

Description of the drawings

Fig. 1 is a schematic flow chart of a context-aware product name recognition method based on stacked conditional random fields according to an embodiment of the present invention.

Fig. 2 is a schematic structural diagram of a context-aware product name recognition device based on stacked conditional random fields according to an embodiment of the present invention.

Detailed description of the embodiments

To make the objects, technical solutions, and advantages of the present invention clearer, the invention and its principles are further described below with reference to a specific embodiment. The embodiment described below serves only to explain the invention and is not intended to limit it.

The context-aware product name recognition method based on stacked conditional random fields is described below using product name recognition in the mobile phone domain as an example. The processing flow is shown in Fig. 1 and comprises the following steps:

Step 1: Manually collect text related to product names as the product name recognition corpus;

Step 2: Collect domain-related product name information and build the domain product knowledge base;

Step 3: Collect product-related text and train the word vector model;

Step 4: Select features and represent the corpus with the selected features;

Step 5: Train the low-level conditional random field model, which recognizes simple entities, and the high-level conditional random field model, which recognizes product names with complex structure;

Step 6: Automatically recognize product names with the conditional random field models.

Each step is described in detail below:

Step 1: Manually collect text related to product names as the product name recognition corpus;

This step mainly prepares the corpus used for model training and evaluation in the subsequent steps.

Since this embodiment uses product name recognition in the mobile phone domain as the example, the step is carried out as follows:

Step 1-1: Download mobile-phone-related web pages from Zhongguancun Online, a website covering mobile phone products, and parse them, keeping only the text content of the page body;

Step 1-2: Perform word segmentation and part-of-speech tagging on the resulting text, for example with ICTCLAS 2015;

Step 1-3: Manually annotate the entities that occur in the segmented text, such as brands, series, models, companies, and product names, obtaining 4000 sentences that contain product entities;

Step 2: Collect domain-related product name information from the Internet and build the domain product knowledge base;

The product knowledge base mainly supplies domain knowledge to the subsequent steps; the knowledge base built in this step is needed during feature selection.

Since this embodiment uses product name recognition in the mobile phone domain as the example, the domain product knowledge base mainly contains mobile phone products. It is built as follows:

Step 2-1: Crawl mobile phone product data from the mobile phone channel of Zhongguancun Online;

Step 2-2: Parse the crawled data to obtain a preliminary product entity list; the following table is an example of the product entity list;

Step 2-3: Manually correct the preliminary product entity list, specifying the brand, series, and model of each product entity, then build and store a product entity list containing each product entity with its brand, series, and model; the format is shown in the table below;

Product entity	Brand name	Series name	Model name
Samsung Galaxy Note2	Samsung	Galaxy	Note2
Nokia Lumia 920	Nokia	Lumia	920
Lenovo S890	Lenovo		S890

Step 2-4: Using the crawled data as reference, manually compile and store a list of common attributes of the domain's products; an example of the product attribute list is as follows:

Step 3: Collect product-related text and train the word vector model;

The word vector model is mainly used to fuse global context information, which further supplements the available context and improves product entity recognition.

Step 3-1: Crawl a large number of mobile-phone-related web pages from the mobile phone channel of Zhongguancun Online and from the Mobile Phone China website, and parse them into plain text; also crawl some mobile-phone-related posts from Sina Weibo. This yields 1,000,000 mobile-phone-related sentences;

Step 3-2: Segment the downloaded text into words, for example with ICTCLAS 2015;

Step 3-3: Train the word vector model on the segmented text using the open-source Word2vec tool, with a window size of 10, a vector dimension of 100, and the skip-gram model. Training produces a word vector model in which each word is represented as a 100-dimensional vector; subsequent steps use this vector to represent the corresponding word;

Step 4: Select features and represent the corpus with the selected features;

The main purpose of this step is to select features and to represent the training data and test data with the same features. The quality of the selected features directly affects the final recognition result.

Step 4-1: Use the current word itself, its part of speech, whether it contains letters, whether it contains digits, and whether it contains special characters as basic features. The current word is the word being handled when a segmented sentence is processed word by word. For example, the sentence "I bought a Samsung mobile phone, the newly released Note2" is processed word by word; when processing reaches "Samsung", the current word is "Samsung", the previous word is "a", and the next word is "mobile phone".

Step 4-2: Using the knowledge base obtained in step 2, take whether the current word is a brand name, a series name, a model name, or a product attribute as separate domain features;

Step 4-3: Using the word vector model obtained in step 3, cluster all words contained in the model with the Kmeans algorithm, where the similarity between two words is measured by the similarity between their corresponding vectors. For two given vectors, the similarity is computed by the following formula:

After clustering, assign a unique class number to each cluster, and use the cluster to which the current word belongs as its category feature;
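The clustering of step 4-3 can be sketched with a small k-means loop. This is not the embodiment's implementation: it is a k-means variant on unit-normalized vectors, so the nearest centroid is the one with the highest cosine similarity (an assumption, since the similarity formula is not shown), and the deterministic initialization and the name `cluster_words` are choices made only to keep the example reproducible.

```python
import math
from typing import Dict, List, Sequence


def _normalize(v: Sequence[float]) -> List[float]:
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm else list(v)


def _dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def cluster_words(vectors: Dict[str, Sequence[float]], k: int, iterations: int = 20) -> Dict[str, int]:
    """Assign each word a class number in 0..k-1 by k-means on unit vectors."""
    words = sorted(vectors)
    if not 1 <= k <= len(words):
        raise ValueError("k must be between 1 and the number of words")
    points = [_normalize(vectors[w]) for w in words]
    # Deterministic initialization (assumption): the first k points are the centroids.
    centroids = [points[i] for i in range(k)]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: pick the centroid with the highest dot product,
        # which equals cosine similarity because all vectors have unit length.
        assignment = [max(range(k), key=lambda c: _dot(p, centroids[c])) for p in points]
        # Update step: each centroid becomes the normalized mean of its members.
        updated = []
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                mean = [sum(col) / len(members) for col in zip(*members)]
                updated.append(_normalize(mean))
            else:
                updated.append(centroids[c])  # keep an empty cluster's centroid
        if updated == centroids:
            break
        centroids = updated
    return dict(zip(words, assignment))


toy_vectors = {
    "galaxy": [1.0, 0.1],
    "lumia": [0.9, 0.2],
    "battery": [0.1, 1.0],
    "screen": [0.2, 0.9],
}
classes = cluster_words(toy_vectors, k=2)
```

The returned class number is what would be written into the category feature column for each word.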

Step 4-4: The features described in steps 4-1 to 4-3 are used by the low-level conditional random field to recognize simple entities, that is, brand names, series names, model names, and company names. On top of these features, the label sequence output by the low-level conditional random field is used as a new feature for the high-level conditional random field model, which recognizes product names with complex structure;

Step 4-5: Divide the 4000 sentences containing product entities obtained in step 1 into two parts, 3000 as training data and 1000 as test data, and represent them with the features described in steps 4-1 to 4-4. Words in the training data and test data use the sequence formats shown in Table 1 and Table 2 respectively. Label sequences are annotated in the BIO scheme: B marks the beginning of an entity, I marks the part of an entity other than its beginning, and O marks non-entity parts. This example uses B-BRA and I-BRA, B-SER and I-SER, B-TYP and I-TYP, B-COM and I-COM, and B-PRO and I-PRO to mark the beginning and the remaining elements of brand names, series names, model names, company names, and product names respectively, and O to mark non-entity elements:

Table 1:

Word	Feature 1 value	Feature 2 value	Feature 3 value	……	Feature n value	Label sequence

Table 2:

Word	Feature 1 value	Feature 2 value	Feature 3 value	……	Feature n value

Data to be recognized uses the sequence format shown in Table 2; the blank last column is filled in by the method of the invention, which achieves the final recognition.

Step 4-6: Represent the sentences obtained in step 1 as features according to the rules defined in step 4-5;

Step 4-7: Recognizing a product name entity depends closely on the words before and after it, so a feature template is defined here to fuse local context information. This embodiment uses CRF++ 0.53 to train and test the conditional random field models; the feature template only needs to be defined in CRF++'s feature template syntax, with template items that combine the features of the previous word, the current word, and the next word.
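The template of step 4-7 is not reproduced in this text. As a hedged illustration, the sketch below generates unigram template lines in CRF++'s `%x[row,col]` syntax for the previous, current, and next word over every feature column; the helper name `build_crfpp_template` and the choice of one template item per column and offset are assumptions.

```python
def build_crfpp_template(num_feature_columns: int) -> str:
    """Generate CRF++ unigram template lines covering offsets -1, 0, and +1.

    Column 0 is the word itself; the remaining columns are its features.
    The label column is not referenced by the template.
    """
    if num_feature_columns < 1:
        raise ValueError("at least one feature column is required")
    lines = ["# Unigram"]
    index = 0
    for column in range(num_feature_columns):
        for offset in (-1, 0, 1):  # previous word, current word, next word
            lines.append(f"U{index:02d}:%x[{offset},{column}]")
            index += 1
    return "\n".join(lines) + "\n"


template_text = build_crfpp_template(2)
```

For two columns this yields six items, from `U00:%x[-1,0]` to `U05:%x[1,1]`; the text could be saved to a file and passed to the CRF++ training tool as its template.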

Step 5: Train the low-level conditional random field model, which recognizes simple entities, and the high-level conditional random field model, which recognizes product names with complex structure. The low-level model recognizes simple entities such as brand names, series names, model names, and company names; the high-level model recognizes product name entities. A sample of the training corpus after feature representation is shown in the table below:

Word	Feature 1	Feature 2	…	Feature n	Label 1	Label 2
I	N	Y	…		O	O
Like	N	N	…	N	O	O
Samsung	Y	N	…	N	B-BRA	B-PRO
Galaxy	N	Y	…	N	B-SER	I-PRO
S3	N	N	…	Y	B-TYP	I-PRO
。	N	N	…	N	O	O

Step 5-1: Take the feature-represented training corpus in the table above, excluding label 2, represent it with the feature template described in step 4-7, and train the low-level conditional random field model, which recognizes entities of simple structure;

Step 5-2: Take the feature-represented training corpus in the table above, represent it with the feature template described in step 4-7, and train the high-level conditional random field model, which recognizes product names with complex structure;

Step 6: Automatically recognize product names with the conditional random field models;

Step 6-1: Represent the data to be recognized with the features defined in step 4 and input it to the low-level conditional random field model to recognize simple entities. The input has the form of the step 5 sample data with the last two columns removed; the model appends a "label 1" column to the input as its output, and simple entities can then be determined from "label 1";

Step 6-2: Take the recognition result of the low-level conditional random field from step 6-1, that is, the output of step 6-1, as the input of the high-level conditional random field model to recognize product names with complex structure; the model appends a "label 2" column to the input as its output.

Step 6-3: Parse the recognition result of step 6-2 according to the meaning of the label sequence agreed in step 4-5, and filter out the non-entity elements labeled O to obtain the final product name recognition result.

A context-aware product name recognition device based on stacked conditional random fields is implemented according to the above method. Its structure is shown in Fig. 2. The device comprises a domain product knowledge base, a word vector model, a trained low-level conditional random field model, a trained high-level conditional random field model, a text preprocessing module, a primary feature representation module, a secondary feature representation module, a preliminary product name recognition module, a final product name recognition module, and a recognition result output module. The text preprocessing module, primary feature representation module, preliminary product name recognition module, secondary feature representation module, final product name recognition module, and recognition result output module are connected in sequence. The domain product knowledge base and the word vector model are each connected to the primary feature representation module, the trained low-level conditional random field model is connected to the preliminary product name recognition module, and the trained high-level conditional random field model is connected to the final product name recognition module;

The domain product knowledge base is built by the knowledge base construction process of claim 3 and comprises a product entity list and a common attribute list. So that it always covers the latest vocabulary of the domain, the knowledge base is periodically supplemented with the latest words;

The word vector model is obtained by the word vector training process of claim 5. So that the model keeps track of changes in the domain's vocabulary, domain-related text is periodically added and the model is retrained;

The primary feature representation module takes all words and their parts of speech produced by the text preprocessing module and derives their feature values from the domain product knowledge base and the word vector model, that is, it represents each word with primary features, preferably the basic features, domain features, and category feature described above;

Preferably, label 1 and label 2 are annotated in the BIO scheme described above.

Test result

To verify the effectiveness of the invention, this embodiment crawled about 70 million Sina Weibo posts dated from February 2012 to April 2013, randomly selected 4000 posts related to mobile phone products, and annotated them manually; 3000 were used for training and 1000 for testing. The comparison experiment uses a conditional random field model with only the basic features of step 4-1 to recognize product name entities. Evaluation metrics in this field include precision, recall, and F1. Because F1 is a comprehensive metric, this experiment uses F1 as its evaluation metric; a higher F1 indicates better performance. The experimental results are shown in the following table:

As the table shows, recognition of brand names, series names, model names, and product entities all improves markedly, with the F1 of product name entities rising the most. The experiment shows that the invention effectively improves product name entity recognition.
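For reference, the three metrics named above can be computed from sets of gold and predicted entities as sketched below. The entity encoding (sentence index, start, end, type), the exact-match criterion, and the function name are assumptions; the source does not state how matches were counted, and the numbers in the example are toy values, not experimental results.

```python
from typing import Set, Tuple

Entity = Tuple[int, int, int, str]  # (sentence index, start, end, entity type)


def precision_recall_f1(gold: Set[Entity], predicted: Set[Entity]) -> Tuple[float, float, float]:
    """Exact-match precision, recall, and F1 over entity sets."""
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy example: one of two predictions matches one of two gold entities.
gold_entities = {(0, 2, 5, "PRO"), (1, 0, 2, "PRO")}
predicted_entities = {(0, 2, 5, "PRO"), (1, 3, 4, "PRO")}
scores = precision_recall_f1(gold_entities, predicted_entities)
```

F1 is the harmonic mean of precision and recall, which is why it serves as a single comprehensive score.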

Claims

1. A context-sensitive product name recognition method based on stacked conditional random fields, the method comprising the following steps:

Step 1: performing word segmentation and part-of-speech tagging on the corpus text as preprocessing;

Step 2: producing a primary feature representation of the corpus text, word by word;

Step 3: representing the current word using the feature template required by the trained low-layer conditional random field model, then recognizing it with the trained low-layer conditional random field model to obtain a preliminary recognition result, denoted label 1;

Step 4: taking each word's primary feature representation plus label 1 as its secondary feature representation;

Step 5: representing the current word using the feature template required by the trained low-layer conditional random field model, then recognizing it with the trained high-layer conditional random field model to obtain the final recognition result, denoted label 2;

Step 6: outputting the corpus text with each word recognized as a product entity annotated with its corresponding label 2;
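The two-pass flow of steps 3 to 5 can be sketched as follows. This is an illustration only: `low_model` and `high_model` are hypothetical callables standing in for the trained conditional random field models, and the feature name `label1` is an assumed encoding.

```python
def stacked_recognize(sentence_features, low_model, high_model):
    """Run two-layer recognition over one sentence.

    sentence_features: list of per-word feature dicts (first-pass features).
    low_model, high_model: callables mapping a list of feature dicts to a
    list of labels; hypothetical stand-ins for the two trained models.
    Returns (label1, label2) as two lists of labels.
    """
    # First pass: preliminary recognition result (label 1).
    label1 = low_model(sentence_features)
    # Second-pass features: the first-pass features plus label 1.
    secondary = [dict(feats, label1=tag)
                 for feats, tag in zip(sentence_features, label1)]
    # Second pass: final recognition result (label 2).
    label2 = high_model(secondary)
    return label1, label2
```

The point of stacking is that the second model sees the first model's decisions as input features, so it can use already recognized inner entities when labelling a longer, nested product name.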

The primary features include basic features, domain features and category features. The basic features represent properties of the word itself, including the word itself, its part of speech, whether it contains letters, whether it contains digits, and whether it contains special characters; the domain features represent properties of the word within its domain, including whether the current word is a brand name, a series name, a model name, or a product attribute; the category feature represents the category to which the word belongs;
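As an illustration of the basic features listed above, the sketch below computes them for a single segmented word. The feature names and the definition of a "special character" are assumptions; the claim lists which properties are used but not how they are encoded.

```python
import re


def basic_features(word, pos):
    """Return the basic features of one segmented word.

    word: the token text; pos: its part-of-speech tag.
    Feature names here are hypothetical.
    """
    return {
        "word": word,
        "pos": pos,
        # Latin letters only; this restriction is an assumption.
        "has_letter": bool(re.search(r"[A-Za-z]", word)),
        "has_digit": any(ch.isdigit() for ch in word),
        # Assumption: a special character is anything that is neither
        # alphanumeric (str.isalnum is True for CJK characters) nor space.
        "has_special": any(not ch.isalnum() and not ch.isspace()
                           for ch in word),
    }
```

Model names often mix letters, digits and punctuation, which is why these surface cues are useful even before any domain knowledge is applied.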

The domain features are determined by string matching against the domain product knowledge base, which is built by the following process:

crawling product-related data from domain-related websites;

parsing the crawled data to obtain a preliminary product entity list;

manually correcting the preliminary product entity list, specifying the brand, series and model to which each product entity belongs, and building and storing a product entity list comprising the product entities and their brands, series and models;

manually compiling, with reference to the crawled data, a list of common attributes of products in the domain, and storing it;
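A minimal sketch of the knowledge base lookup follows. The dictionary-of-sets layout, the function names, and the choice of exact case-insensitive matching are assumptions made for illustration; the claim only states that string matching is used. The example entries in the usage note are illustrative data.

```python
def build_knowledge_base(product_entities, attributes):
    """Build a lookup structure from the corrected product entity list.

    product_entities: iterable of (brand, series, model) tuples.
    attributes: iterable of common product attribute words.
    The dict-of-sets layout is a hypothetical storage format.
    """
    kb = {"brands": set(), "series": set(), "models": set(),
          "attributes": {a.lower() for a in attributes}}
    for brand, series, model in product_entities:
        kb["brands"].add(brand.lower())
        kb["series"].add(series.lower())
        kb["models"].add(model.lower())
    return kb


def domain_features(word, kb):
    """Return the four domain features of a word by exact string match."""
    key = word.lower()
    return {
        "is_brand": key in kb["brands"],
        "is_series": key in kb["series"],
        "is_model": key in kb["models"],
        "is_attribute": key in kb["attributes"],
    }
```

For example, with `kb = build_knowledge_base([("Samsung", "Galaxy", "S4")], ["screen", "battery"])`, the call `domain_features("Galaxy", kb)` marks the word as a series name only.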

The category feature of the current word is determined by the following process:

clustering the words in the word vector model according to the similarity between them, where the similarity between the word vectors corresponding to two words A and B is calculated by the following formula:

outputting the class number of the cluster to which the current word belongs;
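The similarity formula is not reproduced in this text. A common choice for comparing word vectors is cosine similarity, which the sketch below assumes. It then assigns a word to the most similar of a set of precomputed cluster centroids and returns that cluster's number. Both the cosine measure and the centroid-based assignment are assumptions for illustration, not the procedure stated in the claim.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)


def cluster_id(vector, centroids):
    """Return the index of the centroid most similar to the vector.

    centroids: list of cluster centre vectors (hypothetical; how the
    clusters are obtained is outside this sketch).
    """
    return max(range(len(centroids)),
               key=lambda i: cosine_similarity(vector, centroids[i]))
```

The returned index plays the role of the class number: words whose vectors point in similar directions share a class number, which gives the model a coarse semantic feature for words absent from the knowledge base.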

The word vector model is obtained by the following process:

training the word vector model on word-segmented text;

Label 1 and label 2 are annotated using the BIO scheme, in which B denotes the beginning of an entity, I denotes the part of an entity other than its beginning, and O denotes a non-entity part; label 1 obtained in this way is one of the following:

B-BRA: the beginning element of a brand name;

I-BRA: an element of a brand name other than the beginning element;

B-SER: the beginning element of a series name;

I-SER: an element of a series name other than the beginning element;

B-TYP: the beginning element of a model name;

I-TYP: an element of a model name other than the beginning element;

B-COM: the beginning element of a company name;

I-COM: an element of a company name other than the beginning element;

B-PRO: the beginning element of a product name;

I-PRO: an element of a product name other than the beginning element;

O: a non-entity element;
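To show how a BIO label sequence maps back to entities, the following sketch groups labelled words into `(entity_type, text)` pairs. The function name, the `sep` parameter and the handling of a stray I- tag are assumptions for illustration.

```python
def bio_to_entities(words, labels, sep=""):
    """Group BIO-labelled words into (entity_type, text) pairs.

    sep joins the words of one entity; the empty default suits text
    written without spaces between words, such as Chinese.
    """
    entities = []
    current_type, current_words = None, []
    for word, label in zip(words, labels):
        if label.startswith("B-"):
            # A new entity starts; close any entity still open.
            if current_words:
                entities.append((current_type, sep.join(current_words)))
            current_type, current_words = label[2:], [word]
        elif label.startswith("I-") and current_type == label[2:]:
            current_words.append(word)
        else:
            # "O", or an I- tag that does not continue the open entity
            # (such a stray I- word is dropped in this sketch).
            if current_words:
                entities.append((current_type, sep.join(current_words)))
            current_type, current_words = None, []
    if current_words:
        entities.append((current_type, sep.join(current_words)))
    return entities
```

The same word sequence can decode differently under different label sequences: labelled `B-BRA, B-SER, B-TYP` it yields three separate entities, while labelled `B-PRO, I-PRO, I-PRO` it yields one product name.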

The trained low-layer conditional random field model and the trained high-layer conditional random field model are obtained by the following process:

collecting product-related text as a training corpus;

performing word segmentation and part-of-speech tagging on the training corpus;

annotating the brand, series, model, company and product name entities that appear in the segmented text, to obtain sentences containing product entities;

representing the product entities with primary features, label 1 and label 2;

using the product entities represented with primary features and label 1 to train a conditional random field model, obtaining the trained low-layer conditional random field model, whose feature template should include the features of the previous word, the current word and the next word;

using the product entities represented with primary features, label 1 and label 2 to train a conditional random field model, obtaining the trained high-layer conditional random field model, whose feature template should include the features of the previous word, the current word and the next word.
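A feature template covering the previous, current and next word can be sketched as a three-word window over the per-word feature dictionaries. The key prefixes and the boundary marker below are assumptions; the claim only requires that the three positions be included.

```python
def apply_window_template(sentence_features, index):
    """Expand one position into features of the previous, current and
    next word.

    sentence_features: list of per-word feature dicts for one sentence.
    index: position of the current word.
    """
    expanded = {}
    for offset, prefix in ((-1, "prev"), (0, "cur"), (1, "next")):
        pos = index + offset
        if 0 <= pos < len(sentence_features):
            for name, value in sentence_features[pos].items():
                expanded[prefix + ":" + name] = value
        else:
            # Sentence boundary marker (an assumed convention).
            expanded[prefix + ":pad"] = True
    return expanded
```

For the high-layer model the same window can be applied after label 1 has been added to each word's feature dictionary, so the preliminary labels of the neighbouring words also become features of the current word.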

2. A context-sensitive product name recognition device based on stacked conditional random fields, constructed according to the product name recognition method of claim 1, characterized in that it comprises: a domain product knowledge base, a word vector model, a trained low-layer conditional random field model, a trained high-layer conditional random field model, a text preprocessing module, a primary feature representation module, a secondary feature representation module, a preliminary product name recognition module, a final product name recognition module and a recognition result output module; the text preprocessing module, the primary feature representation module, the preliminary product name recognition module, the secondary feature representation module, the final product name recognition module and the recognition result output module are connected in sequence; the domain product knowledge base and the word vector model are each connected to the primary feature representation module; the trained low-layer conditional random field model is connected to the preliminary product name recognition module; and the trained high-layer conditional random field model is connected to the final product name recognition module.