CN107992570A - String mining method and apparatus, electronic device, and computer-readable storage medium - Google Patents
- Publication number
- CN107992570A CN107992570A CN201711230875.0A CN201711230875A CN107992570A CN 107992570 A CN107992570 A CN 107992570A CN 201711230875 A CN201711230875 A CN 201711230875A CN 107992570 A CN107992570 A CN 107992570A
- Authority
- CN
- China
- Prior art keywords
- string
- character string
- training
- trained
- target
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/903—Querying
- G06F16/90335—Query processing
- G06F16/90344—Query processing by using string matching techniques
Abstract
An embodiment of the present disclosure discloses a string mining method and apparatus, an electronic device, and a computer-readable storage medium. The string mining method includes: obtaining a training string data set, where the training string data set includes training string data and string feature data; training on the training string data set to obtain a target-string judgment model; and performing target-string judgment on a test string according to the target-string judgment model. The disclosure can improve the validity of string segmentation and of retrieval, and thereby effectively improve the service quality of merchants or service providers and enhance the user experience.
Description
Technical field
The present disclosure relates to the field of information processing, and in particular to a string mining method and apparatus, an electronic device, and a computer-readable storage medium.
Background technology
With the development of Internet technology, more and more merchants and service providers offer services to users through Internet platforms, striving to improve service quality, enhance the user experience, and win more orders, so as to raise the utilization of existing resources and create more value for merchants or service providers. However, when users currently use the retrieval services provided by merchants or service providers, the hit rate of the retrieval results cannot meet users' requirements, which weakens the user experience.
Summary of the invention
An embodiment of the present disclosure provides a string mining method and apparatus, an electronic device, and a computer-readable storage medium.
In a first aspect, an embodiment of the present disclosure provides a string mining method.
Specifically, the string mining method includes:
obtaining a training string data set, where the training string data set includes training string data and string feature data;
training on the training string data set to obtain a target-string judgment model; and
performing target-string judgment on a test string according to the target-string judgment model.
With reference to the first aspect, in a first implementation of the first aspect, obtaining the training string data in the training string data set includes:
obtaining historical string data;
taking data confirmed as target strings in the historical string data as training positive samples;
taking data confirmed as non-target strings in the historical string data as training negative samples; and
generating the training string data based on the training positive samples and training negative samples.
With reference to the first aspect, in a first implementation of the first aspect, the string feature data includes one or more of: the word-frequency score of a string w within a preset historical period, the mutual-information score of the string w, the information-entropy score of the string w, and whether the string w is a preset name.
With reference to the first aspect, in a first implementation of the first aspect, training on the training string data set to obtain the target-string judgment model includes:
obtaining feature weight values corresponding to the string feature data by training on the training string data; and
generating the target-string judgment model based on the weight values of the string feature data.
With reference to the first aspect, in a first implementation of the first aspect, obtaining feature weight values corresponding to the string feature data by training on the training string data includes:
training on the training string data set to obtain a feature-weight determination model; and
determining the feature weight values corresponding to the string feature data based on the feature-weight determination model.
With reference to the first aspect, in a first implementation of the first aspect, generating the target-string judgment model based on the weight values of the string feature data includes:
generating, according to the weight values of the string feature data, a probability calculation model for whether a string w is a target string; and
confirming a string whose probability meets a preset condition as a target string.
With reference to the first aspect, in a first implementation of the first aspect, the probability calculation model is expressed as:
p = 1 / (1 + exp(-sum_i λi·fi))
where fi denotes the i-th feature in the string feature data, λi denotes the weight value corresponding to the i-th feature fi, and p denotes the probability that the string is a target string.
With reference to the first aspect, in a first implementation of the first aspect, confirming a string whose probability meets a preset condition as a target string includes:
confirming a string whose probability is greater than a preset probability threshold as a target string.
With reference to the first aspect, in a first implementation of the first aspect, the test string is a string input within a preset historical period.
With reference to the first aspect and the first implementation of the first aspect, in a second implementation of the first aspect, the method further includes: performing a preset operation on the target string.
In a second aspect, an embodiment of the present disclosure provides a string mining apparatus.
Specifically, the string mining apparatus includes:
an acquisition module, configured to obtain a training string data set, where the training string data set includes training string data and string feature data;
a training module, configured to train on the training string data set to obtain a target-string judgment model; and
a judgment module, configured to perform target-string judgment on a test string according to the target-string judgment model.
With reference to the second aspect, in a first implementation of the second aspect, the acquisition module includes:
an acquisition submodule, configured to obtain historical string data;
a first confirmation submodule, configured to take data confirmed as target strings in the historical string data as training positive samples;
a second confirmation submodule, configured to take data confirmed as non-target strings in the historical string data as training negative samples; and
a first generation submodule, configured to generate the training string data based on the training positive samples and training negative samples.
With reference to the second aspect, in a first implementation of the second aspect, the string feature data includes one or more of: the word-frequency score of a string w within a preset historical period, the mutual-information score of the string w, the information-entropy score of the string w, and whether the string w is a preset name.
With reference to the second aspect, in a first implementation of the second aspect, the training module includes:
a training submodule, configured to obtain feature weight values corresponding to the string feature data by training on the training string data; and
a second generation submodule, configured to generate the target-string judgment model based on the weight values of the string feature data.
With reference to the second aspect, in a first implementation of the second aspect, the training submodule includes:
a training unit, configured to train on the training string data set to obtain a feature-weight determination model; and
a determination unit, configured to determine the feature weight values corresponding to the string feature data based on the feature-weight determination model.
With reference to the second aspect, in a first implementation of the second aspect, the second generation submodule includes:
a generation unit, configured to generate, according to the weight values of the string feature data, a probability calculation model for whether a string w is a target string; and
a confirmation unit, configured to confirm a string whose probability meets a preset condition as a target string.
With reference to the second aspect, in a first implementation of the second aspect, the probability calculation model is expressed as:
p = 1 / (1 + exp(-sum_i λi·fi))
where fi denotes the i-th feature in the string feature data, λi denotes the weight value corresponding to the i-th feature fi, and p denotes the probability that the string is a target string.
With reference to the second aspect, in a first implementation of the second aspect, the confirmation unit is configured to confirm a string whose probability is greater than a preset probability threshold as a target string.
With reference to the second aspect, in a first implementation of the second aspect, the test string is a string input within a preset historical period.
With reference to the second aspect and the first implementation of the second aspect, in a second implementation of the second aspect, the apparatus further includes: an execution module, configured to perform a preset operation on the target string.
In a third aspect, an embodiment of the present disclosure provides an electronic device, including a memory and a processor, where the memory is configured to store one or more computer instructions that support the string mining apparatus in performing the string mining method of the first aspect, and the processor is configured to execute the computer instructions stored in the memory. The string mining apparatus may further include a communication interface for communication between the string mining apparatus and other devices or communication networks.
In a fourth aspect, an embodiment of the present disclosure provides a computer-readable storage medium for storing computer instructions used by the string mining apparatus, including the computer instructions involved when the string mining apparatus performs the string mining method of the first aspect.
The technical solutions provided by the embodiments of the present disclosure can bring the following beneficial effects:
by considering multiple string features and assigning a corresponding weight value to each feature, the above technical solution analyzes and determines whether a given string is a target string, for example whether it is a new string that can be added to a segmentation dictionary or a retrieval dictionary. This enriches the content of the segmentation dictionary or retrieval dictionary, improves the validity of string segmentation and of retrieval, and ultimately effectively improves the service quality of merchants or service providers and enhances the user experience.
It should be understood that the above general description and the following detailed description are merely exemplary and explanatory, and do not limit the present disclosure.
Brief description of the drawings
With reference to the accompanying drawings, further features, objects, and advantages of the present disclosure will become apparent from the following detailed description of non-limiting embodiments. In the drawings:
Fig. 1 shows a flow chart of the string mining method according to an embodiment of the present disclosure;
Fig. 2 shows a flow chart of step S101 in the embodiment of Fig. 1;
Fig. 3 shows a flow chart of step S102 in the embodiment of Fig. 1;
Fig. 4 shows a flow chart of step S301 in the embodiment of Fig. 3;
Fig. 5 shows a flow chart of step S302 in the embodiment of Fig. 3;
Fig. 6 shows a structural diagram of the string mining apparatus according to an embodiment of the present disclosure;
Fig. 7 shows a structural diagram of the acquisition module 601 in the embodiment of Fig. 6;
Fig. 8 shows a structural diagram of the training module 602 in the embodiment of Fig. 6;
Fig. 9 shows a structural diagram of the training submodule 801 in the embodiment of Fig. 8;
Fig. 10 shows a structural diagram of the second generation submodule 802 in the embodiment of Fig. 8;
Fig. 11 shows a structural diagram of the electronic device according to an embodiment of the present disclosure;
Fig. 12 shows a structural diagram of a computer system suitable for implementing the string mining method according to an embodiment of the present disclosure.
Detailed description of embodiments
Hereinafter, exemplary embodiments of the present disclosure will be described in detail with reference to the accompanying drawings, so that those skilled in the art can easily implement them. In addition, for the sake of clarity, parts unrelated to describing the exemplary embodiments are omitted from the drawings.
In the present disclosure, it should be understood that terms such as "comprising" or "having" are intended to indicate the presence of the features, numbers, steps, acts, components, parts, or combinations thereof disclosed in this specification, and are not intended to exclude the possibility that one or more other features, numbers, steps, acts, components, parts, or combinations thereof exist or are added.
It should also be noted that, in the absence of conflict, the embodiments of the present disclosure and the features in the embodiments may be combined with each other. The present disclosure is described in detail below with reference to the accompanying drawings and in conjunction with the embodiments.
In the technical solutions provided by the embodiments of the present disclosure, multiple string features are considered and a corresponding weight value is assigned to each feature, in order to analyze and determine whether a given string is a target string, for example whether it is a new string that can be added to a segmentation dictionary or a retrieval dictionary. This enriches the content of the segmentation dictionary or retrieval dictionary, improves the validity of string segmentation and of retrieval, and ultimately effectively improves the service quality of merchants or service providers and enhances the user experience.
The strings in the technical solutions of the present disclosure can be used for purposes such as retrieval, search, and matching; for convenience of description, the technical solutions are described in detail below taking retrieval as an example.
Fig. 1 shows a flow chart of the string mining method according to an embodiment of the present disclosure. As shown in Fig. 1, the string mining method includes the following steps S101-S103:
In step S101, a training string data set is obtained, where the training string data set includes training string data and string feature data;
In step S102, the training string data set is trained on to obtain a target-string judgment model;
In step S103, target-string judgment is performed on a test string according to the target-string judgment model.
Currently, when a user uses the retrieval service provided by a merchant or service provider, the merchant or service provider typically retrieves using, as the retrieval object, the string input by the user directly, or the words obtained after splitting that string according to a general-purpose dictionary. Since in many cases these retrieval objects are not present in the segmentation dictionary or retrieval dictionary, the retrieval results obtained from them naturally contain much noise and cannot accurately retrieve the content the user wants to see. The hit rate of the retrieval results thus cannot meet the user's requirements, which reduces the quality of the merchant's or service provider's service and weakens the user experience.
In view of this, this embodiment proposes a string mining method. The method considers multiple string features and, based on training string data, analyzes and determines whether a given string is a target string; the target strings so obtained can subsequently be added to the segmentation dictionary or retrieval dictionary, thereby enriching their content and improving string retrieval validity. Specifically, a training string data set is first obtained, where the training string data set includes training string data and string feature data; the training string data set is then trained on to obtain a target-string judgment model; finally, target-string judgment is performed on a test string according to the target-string judgment model. The determined target strings can subsequently be added to the segmentation dictionary or retrieval dictionary, which improves the validity of string segmentation, and in turn improves the hit rate of retrieval results, improves the quality of the merchant's or service provider's service, and enhances the user experience.
In an optional implementation of this embodiment, as shown in Fig. 2, step S101, i.e., the step of obtaining the training string data in the training string data set, includes steps S201-S204:
In step S201, historical string data is obtained;
In step S202, data confirmed as target strings in the historical string data is taken as training positive samples;
In step S203, data confirmed as non-target strings in the historical string data is taken as training negative samples;
In step S204, the training string data is generated based on the training positive samples and training negative samples.
In this embodiment, in order to obtain more accurate training data, historical string data within a preset historical period may first be obtained at random, where the length of the historical string data can be set according to the needs of the practical application; for example, for a retrieval service based on an Internet platform, the length can be set to 2-5 characters. The historical string data is then verified: data confirmed as target strings that can subsequently be added to the segmentation dictionary or retrieval dictionary is taken as training positive samples, while data confirmed as non-target strings that should not be added to the segmentation dictionary or retrieval dictionary is taken as training negative samples. Finally, the training positive samples and training negative samples are combined to form the training string data.
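The sampling procedure above can be sketched as follows; this is a minimal illustration under assumed data shapes, not the patent's implementation, and all names are hypothetical:

```python
# Minimal sketch of assembling the training set: verified historical
# strings become positive samples, the rest negative samples.

def build_training_set(history_strings, is_target):
    """history_strings: candidate strings (e.g. 2-5 characters);
    is_target: callable that encodes the manual verification step."""
    positives = [(s, 1) for s in history_strings if is_target(s)]
    negatives = [(s, 0) for s in history_strings if not is_target(s)]
    return positives + negatives

# Toy usage: two strings verified as dictionary-worthy, one not.
verified = {"火锅店", "奶茶"}
data = build_training_set(["火锅店", "奶茶", "的地得"], lambda s: s in verified)
```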
In an optional implementation of this embodiment, the string feature data includes one or more of: the word-frequency score of a string w within a preset historical period, the mutual-information score of the string w, the information-entropy score of the string w, and whether the string w is a preset name.
The above string feature data are features that, after repeated testing and verification, have been found to improve the validity of string segmentation and of retrieval.
The word-frequency score f1 of a string w within a preset historical period can be expressed as:
f1 = log(count(w))
where count(w) is the number of times a preset operation such as retrieval or query was performed for the string w within the preset historical period.
A higher word-frequency score f1 indicates that the string w is used more frequently.
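As a small illustration of f1, assuming a `counts` lookup built offline from query logs over the preset historical window (all names here are hypothetical):

```python
import math

def word_freq_score(w, counts):
    # f1 = log(count(w)), the log of how often w was retrieved/queried
    # in the preset historical window.
    return math.log(counts[w])

counts = {"奶茶": 1000, "的地得": 1}
```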
The mutual-information score f2 of a string w can be expressed as:
f2 = (1/(N-1)) · sum over i = 1..N-1 of log( p(ci, ci+1) / (p(ci)·p(ci+1)) )
where N denotes the length of the string w, c1c2…cN denote the characters in the string w, p(ci, ci+1) denotes the probability that the two characters ci and ci+1 co-occur, and p(ci) denotes the probability that the character ci appears.
A higher mutual-information score f2 indicates that the characters inside the string bind more tightly.
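One plausible reading of f2 is the average pointwise mutual information over adjacent character pairs; a sketch under that assumption, with toy probability tables standing in for corpus statistics:

```python
import math

def mutual_info_score(w, p_char, p_pair):
    """f2 as average pointwise mutual information over adjacent
    character pairs of w; p_char/p_pair are assumed probability
    lookups estimated offline from a corpus."""
    pairs = list(zip(w, w[1:]))
    pmi = [math.log(p_pair[(a, b)] / (p_char[a] * p_char[b]))
           for a, b in pairs]
    return sum(pmi) / len(pmi)

# Toy statistics: a pair co-occurring exactly as independence predicts
# scores 0; a pair co-occurring twice as often scores log(2) > 0.
p_char = {"a": 0.5, "b": 0.5}
independent = mutual_info_score("ab", p_char, {("a", "b"): 0.25})
tight = mutual_info_score("ab", p_char, {("a", "b"): 0.5})
```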
The information-entropy score f3 of a string w can be expressed as:
f3 = Hleft(w) + Hright(w)
Hleft(w) = -sum over a in A of p(aw|w)·log p(aw|w)
Hright(w) = -sum over b in B of p(wb|w)·log p(wb|w)
where Hleft(w) denotes the left information entropy, Hright(w) denotes the right information entropy, A denotes the set of left-neighbor characters of the string w, B denotes the set of right-neighbor characters of the string w, and p(aw|w) and p(wb|w) denote the conditional probabilities that the left-neighbor character and right-neighbor character appear, respectively.
A higher information-entropy score f3 indicates that the string w is used more flexibly in its external contexts.
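A sketch of f3 under the definitions above, assuming the neighbor distributions have already been estimated from a corpus (the inputs and names are illustrative):

```python
import math

def entropy(probs):
    # Shannon entropy H = -sum p*log(p) of a neighbor distribution.
    return -sum(p * math.log(p) for p in probs if p > 0)

def info_entropy_score(left_probs, right_probs):
    """f3 = Hleft(w) + Hright(w). The inputs are the assumed
    conditional probabilities p(aw|w) and p(wb|w) of each observed
    left/right neighbor character of w."""
    return entropy(left_probs) + entropy(right_probs)
```

A string always flanked by the same neighbors scores 0; uniform neighbor distributions maximize the score, matching the "external flexibility" reading in the text.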
Whether the string w is a preset name is a feature drawn from human knowledge: if a string w appears as a preset name, it is likely to be a word that needs to be added to the dictionary. The preset name can be, for example, a merchant name, service-provider name, product name, or service name. This feature can be further divided into two features: whether the string w is a merchant name or service-provider name, and whether the string w is a product name or service name. Specifically, the feature of whether the string w is a merchant name or service-provider name can be expressed as:
f4 = 1 if w is a merchant name or service-provider name, and 0 otherwise.
The feature of whether the string w is a product name or service name can be expressed as:
f5 = 1 if w is a product name or service name, and 0 otherwise.
In an optional implementation of this embodiment, as shown in Fig. 3, step S102, i.e., the step of training on the training string data set to obtain the target-string judgment model, includes steps S301-S302:
In step S301, feature weight values corresponding to the string feature data are obtained by training on the training string data;
In step S302, the target-string judgment model is generated based on the weight values of the string feature data.
In this embodiment, a model-training method and an optimization algorithm are first used to train on the training string data set to obtain a set of feature weight values corresponding to the string feature data, and the target-string judgment model is then generated based on those weight values.
Those skilled in the art can choose the model-training method and optimization algorithm according to the needs of the practical application; the present disclosure does not specifically limit them.
In an optional implementation of this embodiment, as shown in Fig. 4, step S301, i.e., the step of obtaining feature weight values corresponding to the string feature data by training on the training string data, includes steps S401-S402:
In step S401, training is performed on the training string data set to obtain a feature-weight determination model;
In step S402, the feature weight values corresponding to the string feature data are determined based on the feature-weight determination model.
As mentioned above, the present disclosure considers many kinds of string feature data, and these feature data can characterize the necessity of performing a certain preset operation on a string, for example the necessity of adding it to the segmentation dictionary or retrieval dictionary. However, the individual features contribute differently to this necessity judgment; that is, when the above multiple feature data are used to characterize the necessity of performing a preset operation on a string, the weights of the different features should not be treated as equal, but should be differentiated.
Therefore, in this embodiment, a training model plus an optimization algorithm is used to determine the optimal weight distribution over the above multiple feature data. For example, based on the training string data set, a feature-weight determination model can be obtained using a logistic regression model, which is simple, efficient, and very widely used in practical machine-learning applications; this feature-weight determination model is then used to obtain the feature weight values corresponding to the above feature data, and by applying an optimization algorithm a set of optimal feature weight values corresponding to the feature data can usually be obtained.
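A minimal sketch of learning such differentiated per-feature weights with logistic regression via plain gradient descent; this is illustrative only (a real system would use a library implementation and the optimizer of choice), and all names are hypothetical:

```python
import math

def train_logistic(samples, lr=0.5, epochs=2000):
    """Gradient-descent training of a logistic regression, as one
    concrete way to learn per-feature weights.
    samples: list of (feature_vector, label) with label in {0, 1}.
    Bias term omitted for brevity; returns one weight per feature."""
    n = len(samples[0][0])
    w = [0.0] * n
    for _ in range(epochs):
        for x, y in samples:
            z = sum(wi * xi for wi, xi in zip(w, x))
            p = 1.0 / (1.0 + math.exp(-z))   # predicted probability
            for i in range(n):               # gradient step on each weight
                w[i] += lr * (y - p) * x[i]
    return w
```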
In an optional implementation of this embodiment, as shown in Fig. 5, step S302, i.e., the step of generating the target-string judgment model based on the weight values of the string feature data, includes steps S501-S502:
In step S501, a probability calculation model for whether a string w is a target string is generated according to the weight values of the string feature data;
In step S502, a string whose probability meets a preset condition is confirmed as a target string.
In this embodiment, the target-string judgment model may include the probability calculation model for whether a string w is a target string, and the part that judges whether the string w is a target string according to the obtained probability value. Specifically, the probability calculation model is first generated according to the weight values of the string feature data obtained above; the probability calculation model can be expressed as:
p = 1 / (1 + exp(-sum_i λi·fi))
where fi denotes the i-th feature in the string feature data, λi denotes the weight value corresponding to the i-th feature fi, and p denotes the probability that the string w is a target string. Then, whether the string w is a target string is judged according to its corresponding probability value. For example, a string whose probability value is greater than a preset probability threshold can be considered relatively important in itself, carrying more information, and relatively effective for preset operations such as retrieval and query; such strings can therefore be confirmed as target strings, and adding them to the segmentation dictionary or retrieval dictionary can improve the validity of string segmentation and in turn the hit rate of retrieval results.
The probability threshold can be set according to the needs of the practical application; the present disclosure does not specifically limit its value.
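Putting the judgment model together, assuming the logistic form implied by the logistic regression described above (the names and the 0.5 default threshold are illustrative):

```python
import math

def target_prob(features, weights):
    # p = 1 / (1 + exp(-sum_i lambda_i * f_i))
    z = sum(l * f for l, f in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))

def is_target(features, weights, threshold=0.5):
    # Confirm the string as a target string when p exceeds the preset
    # probability threshold; it would then be added to the dictionary.
    return target_prob(features, weights) > threshold
```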
In an optional implementation of this embodiment, the test string is a string input within a preset historical period. The test string is similar to the training strings; its length can be set according to the needs of the practical application, for example to 2-5 characters.
In an optional implementation of this embodiment, the method further includes the step of performing a preset operation on the target string, where the preset operation includes one or more of: adding the string to a dictionary file such as a segmentation dictionary or retrieval dictionary, performing retrieval, performing search, performing query, and performing matching.
When the preset operation is adding to a dictionary file such as a segmentation dictionary or retrieval dictionary, in an optional implementation of this embodiment, the method further includes the step of judging whether the test string already exists in the dictionary file. This step mainly determines whether the subsequent target-string judgment is necessary; that is, target-string judgment is performed only for test strings that do not already exist in the dictionary file.
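The pre-check can be sketched as a simple membership filter (names are hypothetical):

```python
def strings_to_judge(test_strings, dictionary):
    """Membership pre-check: only test strings not already present in
    the dictionary file proceed to target-string judgment."""
    return [s for s in test_strings if s not in dictionary]
```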
The following are apparatus embodiments of the present disclosure, which can be used to perform the method embodiments of the present disclosure.
Fig. 6 shows a structural diagram of the string mining apparatus according to an embodiment of the present disclosure. The apparatus can be implemented as part or all of an electronic device by software, hardware, or a combination of both. As shown in Fig. 6, the string mining apparatus includes:
an acquisition module 601, configured to obtain a training string data set, where the training string data set includes training string data and string feature data;
a training module 602, configured to train on the training string data set to obtain a target-string judgment model; and
a judgment module 603, configured to perform target-string judgment on a test string according to the target-string judgment model.
Currently, when a user uses the retrieval service provided by a merchant or service provider, the merchant or service provider typically retrieves using, as the retrieval object, the string input by the user directly, or the words obtained after splitting that string according to a general-purpose dictionary. Since in many cases these retrieval objects are not present in the segmentation dictionary or retrieval dictionary, the retrieval results obtained from them naturally contain much noise and cannot accurately retrieve the content the user wants to see. The hit rate of the retrieval results thus cannot meet the user's requirements, which reduces the quality of the merchant's or service provider's service and weakens the user experience.
In this embodiment, a character string mining apparatus is proposed. By considering multiple character string features and analyzing the training character string data, the apparatus determines whether a given character string is a target string; the target strings so obtained can subsequently be added to the segmentation dictionary or retrieval dictionary, enriching their content and improving the validity of character string retrieval. Specifically, the acquisition module 601 first obtains a training character string data set, where the training character string data set includes training character string data and character string feature data; the training module 602 then trains on the training character string data set to obtain a target string judgment model; finally, the judgment module 603 performs target string judgement on a test character string according to the target string judgment model. The confirmed target strings can subsequently be added to the segmentation dictionary or retrieval dictionary, which improves the validity of string segmentation, raises the hit rate of retrieval results, improves the quality of the merchant's or service provider's service, and strengthens the user experience.
In an optional implementation of this embodiment, as shown in Fig. 7, the acquisition module 601 includes:
an acquisition submodule 701, configured to obtain history character string data;
a first confirmation submodule 702, configured to take the data confirmed as target strings in the history character string data as training positive samples;
a second confirmation submodule 703, configured to take the data confirmed as non-target strings in the history character string data as training negative samples;
a first generation submodule 704, configured to generate the training character string data based on the training positive samples and training negative samples.
In this embodiment, to obtain more accurate training data, the acquisition submodule 701 first randomly obtains history character string data within a preset historical time period, where the length of the history character strings can be configured according to the needs of the practical application; for a retrieval service based on an internet platform, for example, the length can be set to 2-5 characters. The history character string data is then verified: the first confirmation submodule 702 takes the data confirmed as target strings, i.e. strings worth subsequently adding to the segmentation dictionary or retrieval dictionary, as training positive samples; conversely, the second confirmation submodule 703 takes the data confirmed as non-target strings, i.e. strings not worth adding, as training negative samples. Finally, the first generation submodule 704 combines the training positive samples and training negative samples to form the training character string data.
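The sample-construction flow above can be sketched in Python. The `is_target` callable and the in-memory lists below are hypothetical stand-ins for the manual confirmation step and the history log; they are not part of the disclosure.

```python
def build_training_data(history_strings, is_target, min_len=2, max_len=5):
    """Split verified history strings into positive samples (confirmed
    target strings, label 1) and negative samples (label 0), keeping only
    strings within the suggested 2-5 character length, then combine them."""
    kept = [s for s in history_strings if min_len <= len(s) <= max_len]
    positives = [(s, 1) for s in kept if is_target(s)]
    negatives = [(s, 0) for s in kept if not is_target(s)]
    return positives + negatives

# 'is_target' stands in for manual verification of each history string.
samples = build_training_data(
    ["hotpot", "ab", "xy", "a"],
    is_target=lambda s: s in {"ab"})
```

Strings outside the configured length range ("hotpot", "a") are dropped before labeling, matching the 2-5 character setting suggested above.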
In an optional implementation of this embodiment, the character string feature data includes one or more of: the word frequency score of character string w in a preset historical time period, the mutual information score of character string w, the information entropy score of character string w, and whether character string w is a preset name.
The above character string features are those which, after repeated tests and verification, the disclosure has found capable of improving the validity of string segmentation and retrieval.
The word frequency score f1 of character string w in the preset historical time period can be expressed as:

f1 = log(count(w))

where count(w) is the number of times that, within the preset historical time period, a retrieval, query, or equivalent predetermined operation is performed for character string w.

A higher word frequency score f1 of character string w in the preset historical time period characterizes more frequent use of that string.
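As a minimal sketch, f1 could be computed from a log of retrieval/query operations; the list-based log below is an assumption for illustration only.

```python
import math

def word_frequency_score(w, operation_log):
    """f1 = log(count(w)), where count(w) is the number of times w was
    used for retrieval/query/equivalent operations in the preset
    historical time period (modeled here as a list of logged strings)."""
    count = operation_log.count(w)
    return math.log(count) if count > 0 else 0.0

log_entries = ["pizza", "pizza", "pizza", "sushi"]
f1 = word_frequency_score("pizza", log_entries)
```

Strings never used in the period are given a score of 0.0 here, a convention chosen for this sketch since log(0) is undefined.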
The mutual information score f2 of character string w can be expressed as:

f2 = Σi=1..N-1 log( p(ci, ci+1) / ( p(ci) · p(ci+1) ) )

where N represents the length of character string w, c1c2…cN represent the characters in character string w, p(ci, ci+1) represents the probability that the two characters ci and ci+1 co-occur, and p(ci) represents the probability that character ci appears.

A higher mutual information score f2 of character string w characterizes a tighter binding between the characters inside the string.
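A sketch of the f2 computation, assuming the character and character-pair probabilities have already been estimated from a corpus; the dict-based probability lookups are hypothetical.

```python
import math

def mutual_information_score(w, p_char, p_pair):
    """f2: sum of log(p(ci, ci+1) / (p(ci) * p(ci+1))) over the N-1
    adjacent character pairs of w; higher values mean the characters
    inside w bind more tightly."""
    return sum(math.log(p_pair[(ci, cj)] / (p_char[ci] * p_char[cj]))
               for ci, cj in zip(w, w[1:]))

# Toy probabilities, as if estimated from a corpus.
p_char = {"a": 0.1, "b": 0.2}
p_pair = {("a", "b"): 0.05}
f2 = mutual_information_score("ab", p_char, p_pair)
```

For the pair ("a", "b") this evaluates log(0.05 / 0.02), positive because the pair co-occurs more often than independence would predict.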
The information entropy score f3 of character string w can be expressed as:

f3 = Hleft(w) + Hright(w)

Hleft(w) = -Σa∈A p(aw|w) · log p(aw|w)

Hright(w) = -Σb∈B p(wb|w) · log p(wb|w)

where Hleft(w) represents the left information entropy, Hright(w) represents the right information entropy, A represents the set of left neighbor words of character string w, B represents the set of right neighbor words of character string w, and p(aw|w) and p(wb|w) represent the conditional probabilities of the left neighbor word and right neighbor word appearing, respectively.

A higher information entropy score f3 of character string w characterizes a greater external flexibility of the string.
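The f3 computation can be sketched from observed neighbor-word counts; the count dictionaries below are hypothetical examples.

```python
import math

def entropy(neighbor_counts):
    """Shannon entropy -sum(p * log p) over an observed neighbor-word
    count distribution."""
    total = sum(neighbor_counts.values())
    return -sum((c / total) * math.log(c / total)
                for c in neighbor_counts.values())

def information_entropy_score(left_neighbors, right_neighbors):
    """f3 = H_left(w) + H_right(w): higher when w appears in many
    different left and right contexts (greater external flexibility)."""
    return entropy(left_neighbors) + entropy(right_neighbors)

# w seen after two different left neighbors, before a single right one.
f3 = information_entropy_score({"x": 1, "y": 1}, {"z": 2})
```

A string with varied contexts on both sides scores higher; a string always flanked by the same words scores 0, suggesting it is only a fragment of a larger expression.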
The feature of whether character string w is a preset name belongs to artificial knowledge features: if a character string w appears as a preset name, it is very likely a word that needs to be added to the dictionary. The preset name may be, for example, a company name, a service provider name, a product name, or a service name. This feature can further be split into two features: whether character string w is a company name or service provider name, and whether character string w is a product name or service name. Specifically, the feature of whether character string w is a company name or service provider name can be expressed as:

f4 = 1 if w is a company name or service provider name, and f4 = 0 otherwise.

The feature of whether character string w is a product name or service name can be expressed as:

f5 = 1 if w is a product name or service name, and f5 = 0 otherwise.
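These two name features reduce to binary indicators against known-name lookup tables; the name sets below are hypothetical examples standing in for curated business data.

```python
def name_features(w, company_names, product_names):
    """f4 = 1 if w is a known company/service-provider name, else 0;
    f5 = 1 if w is a known product/service name, else 0.
    Both name sets are assumed to come from curated business data."""
    return (1.0 if w in company_names else 0.0,
            1.0 if w in product_names else 0.0)

f4, f5 = name_features("ACME", {"ACME"}, {"widget"})
```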
In an optional implementation of this embodiment, as shown in Fig. 8, the training module 602 includes:
a training submodule 801, configured to obtain feature weight values corresponding to the character string feature data through training based on the training character string data;
a second generation submodule 802, configured to generate the target string judgment model based on the weight values of the character string feature data.
In this embodiment, the training submodule 801 first uses a model-training method and optimization algorithm to obtain, through training on the training character string data, a set of feature weight values corresponding to the character string feature data, and the second generation submodule 802 then generates the target string judgment model based on these weight values of the character string feature data.
Those skilled in the art can choose the model-training method and optimization algorithm according to the needs of the practical application, and the disclosure imposes no specific limitation on them.
In an optional implementation of this embodiment, as shown in Fig. 9, the training submodule 801 includes:
a training unit 901, configured to train based on the training character string data set to obtain a feature weight determination model;
a determination unit 902, configured to determine the feature weight values corresponding to the character string feature data based on the feature weight determination model.
As mentioned above, the disclosure considers many kinds of character string feature data, which can be used to characterize the necessity of performing a certain predetermined operation on a character string, for example the necessity of adding it to the segmentation dictionary or retrieval dictionary. However, the above features contribute differently to that necessity judgement; that is, when multiple features are used to characterize the necessity of performing the predetermined operation on a character string, the weights of the different features should not be treated as equal but should be differentiated.

Therefore, in this embodiment, the training unit 901 determines the optimal weight distribution over the above features by training a model with an optimization algorithm. For example, based on the training character string data set, a feature weight determination model can be obtained using a logistic regression model, a simple, efficient, and very widely applied model in machine learning; the determination unit 902 then uses this feature weight determination model to obtain the feature weight values corresponding to the above features, and by iterating the optimization algorithm a set of optimal feature weight values corresponding to the feature data can usually be obtained.
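The weight-training step can be illustrated with a toy, self-contained logistic regression trained by gradient descent. This is only a sketch of one possible model-training method; any library implementation and the specific learning-rate/epoch values are equally valid choices.

```python
import math

def train_feature_weights(samples, lr=0.5, epochs=2000):
    """Minimal logistic-regression training by per-sample gradient
    descent: each sample is (feature_vector, label); returns one weight
    per feature.  A toy stand-in for the training unit, not a
    production trainer."""
    dim = len(samples[0][0])
    weights = [0.0] * dim
    for _ in range(epochs):
        for features, label in samples:
            z = sum(l * f for l, f in zip(weights, features))
            p = 1.0 / (1.0 + math.exp(-z))
            for i, f in enumerate(features):
                weights[i] += lr * (label - p) * f
    return weights

# Toy data: first feature is informative, second is a constant bias term.
data = [([3.0, 1.0], 1), ([2.5, 1.0], 1), ([0.5, 1.0], 0), ([0.2, 1.0], 0)]
weights = train_feature_weights(data)
```

After training, the weight on the informative first feature is positive, so strings with larger feature values receive higher target probabilities.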
In an optional implementation of this embodiment, as shown in Fig. 10, the second generation submodule 802 includes:
a generation unit 1001, configured to generate, according to the weight values of the character string feature data, a probability calculation model of character string w being a target string;
a confirmation unit 1002, configured to confirm character strings whose probability meets a preset condition as target strings.
In this embodiment, the target string judgment model may include a probability calculation model of character string w being a target string, together with the judgement, based on the obtained probability value, of whether character string w is a target string. Specifically, the generation unit 1001 first generates, according to the weight values of the character string feature data obtained above, the probability calculation model of character string w being a target string, which can be expressed as:

p = 1 / (1 + exp(-Σi λi fi))

where fi represents the i-th feature in the character string feature data, λi represents the weight value corresponding to the i-th feature fi, and p represents the probability value of character string w being a target string. The confirmation unit 1002 then judges, according to the probability value corresponding to a given character string w, whether w is a target string. For example, a character string whose probability value exceeds a preset probability threshold can be considered intrinsically important: it carries more information and is relatively effective for retrieval, query, and equivalent predetermined operations. Such character strings can therefore be confirmed as target strings, and adding them to the segmentation dictionary or retrieval dictionary improves the validity of string segmentation and, in turn, the hit rate of retrieval results.
The probability threshold can be configured according to the needs of the practical application, and the disclosure imposes no specific limitation on its value.
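The probability calculation and threshold judgement can be sketched together; the weight values and the 0.5 threshold below are illustrative, since the disclosure leaves the threshold to the practical application.

```python
import math

def target_probability(features, weights):
    """p = 1 / (1 + exp(-sum_i(lambda_i * f_i))): probability that the
    character string is a target string, given its feature values f_i
    and the trained weights lambda_i."""
    z = sum(l * f for l, f in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))

def is_target_string(features, weights, threshold=0.5):
    """Confirm the string as a target string when its probability
    exceeds the preset probability threshold."""
    return target_probability(features, weights) > threshold
```

A string whose weighted feature sum is 0 sits exactly at p = 0.5, so the threshold cleanly separates strings with net positive evidence from the rest.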
In an optional implementation of this embodiment, the test character string is a character string input within a preset historical time period. The test character string is similar to the training character strings: its length can be set according to the needs of the practical application, for example to 2-5 characters.
In an optional implementation of this embodiment, the apparatus further includes an execution module, configured to perform a predetermined operation on the target string, where the predetermined operation includes one or more of: adding the string to a dictionary file such as a segmentation dictionary or retrieval dictionary, performing retrieval, performing search, performing query, and performing matching.
When the predetermined operation is adding the string to a dictionary file such as a segmentation dictionary or retrieval dictionary, in an optional implementation of this embodiment the apparatus further includes a second judgment module, which can be configured to judge whether the test character string already exists in the dictionary file. The second judgment module mainly serves to determine whether the subsequent target string judgement is necessary; that is, the judgment module performs target string judgement only for test character strings that do not exist in the dictionary file.
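The pre-check performed by the second judgment module amounts to a membership test before the model is invoked; `judge` below is a hypothetical stand-in for the trained judgment model.

```python
def judge_if_new(test_string, dictionary, judge):
    """Run the (possibly expensive) target-string judgement only for
    test strings that are not already in the dictionary file; strings
    already present need no further judgement."""
    if test_string in dictionary:
        return None          # already in dictionary, skip judgement
    return judge(test_string)

result = judge_if_new("hotpot", {"noodles"}, judge=lambda s: True)
```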
The disclosure also discloses an electronic device. Figure 11 shows a structural block diagram of the electronic device according to an embodiment of the disclosure. As shown in Figure 11, the electronic device 1100 includes a memory 1101 and a processor 1102, where
the memory 1101 is used to store one or more computer instructions, and the one or more computer instructions are executed by the processor 1102 to realize:
obtaining a training character string data set, where the training character string data set includes training character string data and character string feature data;
training on the training character string data set to obtain a target string judgment model;
performing target string judgement on a test character string according to the target string judgment model.
The one or more computer instructions can also be executed by the processor 1102 to realize the following.
Obtaining the training character string data in the training character string data set includes:
obtaining history character string data;
taking the data confirmed as target strings in the history character string data as training positive samples;
taking the data confirmed as non-target strings in the history character string data as training negative samples;
generating the training character string data based on the training positive samples and training negative samples.
The character string feature data includes one or more of: the word frequency score of character string w in a preset historical time period, the mutual information score of character string w, the information entropy score of character string w, and whether character string w is a preset name.
Training on the training character string data set to obtain a target string judgment model includes:
obtaining feature weight values corresponding to the character string feature data through training based on the training character string data;
generating the target string judgment model based on the weight values of the character string feature data.
Obtaining feature weight values corresponding to the character string feature data through training based on the training character string data includes:
training based on the training character string data set to obtain a feature weight determination model;
determining the feature weight values corresponding to the character string feature data based on the feature weight determination model.
Generating the target string judgment model based on the weight values of the character string feature data includes:
generating, according to the weight values of the character string feature data, a probability calculation model of character string w being a target string;
confirming character strings whose probability meets a preset condition as target strings.
The probability calculation model is expressed as:

p = 1 / (1 + exp(-Σi λi fi))

where fi represents the i-th feature in the character string feature data, λi represents the weight value corresponding to the i-th feature fi, and p represents the probability value of the character string being a target string.
Confirming character strings whose probability meets a preset condition as target strings includes:
confirming character strings whose probability exceeds a preset probability threshold as target strings.
The test character string is a character string input within a preset historical time period.
The instructions can further realize:
performing a predetermined operation on the target string.
Figure 12 is a structural schematic diagram of a computer system suitable for implementing the character string mining method according to an embodiment of the disclosure.
As shown in Figure 12, the computer system 1200 includes a central processing unit (CPU) 1201, which can perform the various processes of the embodiments shown in Figs. 1-5 above according to a program stored in a read-only memory (ROM) 1202 or a program loaded from a storage portion 1208 into a random access memory (RAM) 1203. The RAM 1203 also stores the various programs and data required for the operation of the system 1200. The CPU 1201, ROM 1202 and RAM 1203 are connected to each other through a bus 1204. An input/output (I/O) interface 1205 is also connected to the bus 1204.
The following components are connected to the I/O interface 1205: an input portion 1206 including a keyboard, a mouse, etc.; an output portion 1207 including a cathode ray tube (CRT), a liquid crystal display (LCD), etc., as well as a loudspeaker, etc.; a storage portion 1208 including a hard disk, etc.; and a communication portion 1209 including a network interface card such as a LAN card, a modem, etc. The communication portion 1209 performs communication processing via a network such as the internet. A driver 1210 is also connected to the I/O interface 1205 as needed. A removable medium 1211, such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, etc., is installed on the driver 1210 as needed, so that the computer program read from it is installed into the storage portion 1208 as needed.
In particular, according to embodiments of the present disclosure, the methods described above with reference to Figs. 1-5 may be implemented as a computer software program. For example, embodiments of the present disclosure include a computer program product, which includes a computer program tangibly embodied on a machine-readable medium, the computer program including program code for executing the character string mining method of Figs. 1-5. In such an embodiment, the computer program can be downloaded and installed from a network through the communication portion 1209, and/or installed from the removable medium 1211.
The flowcharts and block diagrams in the drawings illustrate the possible architectures, functions, and operations of the systems, methods, and computer program products according to various embodiments of the disclosure. In this regard, each block in a flowchart or block diagram can represent a module, a program segment, or a part of code, which includes one or more executable instructions for realizing a specified logic function. It should also be noted that, in some alternative implementations, the functions marked in the blocks can occur in an order different from that marked in the drawings. For example, two blocks shown in succession can in fact be executed substantially in parallel, and they can sometimes be executed in the opposite order, depending on the functions involved. It should further be noted that each block in the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be realized by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
The units or modules involved in the embodiments of the present disclosure can be realized by software or by hardware. The described units or modules can also be provided in a processor, and under certain conditions the names of these units or modules do not constitute a limitation of the units or modules themselves.
As another aspect, the disclosure also provides a computer-readable storage medium, which can be the computer-readable storage medium included in the apparatus described in the above embodiment, or a stand-alone computer-readable storage medium not assembled into a device. The computer-readable storage medium stores one or more programs, which are used by one or more processors to perform the methods described in the disclosure.
The above description is only the preferred embodiments of the disclosure and an explanation of the applied technical principles. Those skilled in the art should appreciate that the invention scope involved in the disclosure is not limited to technical solutions formed by the particular combinations of the above technical features, and should also cover, without departing from the inventive concept, other technical solutions formed by arbitrary combinations of the above technical features or their equivalent features, for example technical solutions formed by replacing the above features with (but not limited to) technical features with similar functions disclosed in the disclosure.
The present disclosure discloses A1, a character string mining method, the method including: obtaining a training character string data set, where the training character string data set includes training character string data and character string feature data; training on the training character string data set to obtain a target string judgment model; and performing target string judgement on a test character string according to the target string judgment model.
A2, the method according to A1, where obtaining the training character string data in the training character string data set includes: obtaining history character string data; taking the data confirmed as target strings in the history character string data as training positive samples; taking the data confirmed as non-target strings in the history character string data as training negative samples; and generating the training character string data based on the training positive samples and training negative samples.
A3, the method according to A1, where the character string feature data includes one or more of: the word frequency score of character string w in a preset historical time period, the mutual information score of character string w, the information entropy score of character string w, and whether character string w is a preset name.
A4, the method according to A1, where training on the training character string data set to obtain a target string judgment model includes: obtaining feature weight values corresponding to the character string feature data through training based on the training character string data; and generating the target string judgment model based on the weight values of the character string feature data.
A5, the method according to A4, where obtaining feature weight values corresponding to the character string feature data through training based on the training character string data includes: training based on the training character string data set to obtain a feature weight determination model; and determining the feature weight values corresponding to the character string feature data based on the feature weight determination model.
A6, the method according to A4, where generating the target string judgment model based on the weight values of the character string feature data includes: generating, according to the weight values of the character string feature data, a probability calculation model of character string w being a target string; and confirming character strings whose probability meets a preset condition as target strings.
A7, the method according to A6, where the probability calculation model is expressed as:

p = 1 / (1 + exp(-Σi λi fi))

where fi represents the i-th feature in the character string feature data, λi represents the weight value corresponding to the i-th feature fi, and p represents the probability value of the character string being a target string.
A8, the method according to A6, where confirming character strings whose probability meets a preset condition as target strings includes: confirming character strings whose probability exceeds a preset probability threshold as target strings.
A9, the method according to A1, where the test character string is a character string input within a preset historical time period.
A10, the method according to A1, further including: performing a predetermined operation on the target string.
The present disclosure discloses B11, a character string mining apparatus, the apparatus including: an acquisition module, configured to obtain a training character string data set, where the training character string data set includes training character string data and character string feature data; a training module, configured to train on the training character string data set to obtain a target string judgment model; and a judgment module, configured to perform target string judgement on a test character string according to the target string judgment model.
B12, the apparatus according to B11, where the acquisition module includes: an acquisition submodule, configured to obtain history character string data; a first confirmation submodule, configured to take the data confirmed as target strings in the history character string data as training positive samples; a second confirmation submodule, configured to take the data confirmed as non-target strings in the history character string data as training negative samples; and a first generation submodule, configured to generate the training character string data based on the training positive samples and training negative samples.
B13, the apparatus according to B11, where the character string feature data includes one or more of: the word frequency score of character string w in a preset historical time period, the mutual information score of character string w, the information entropy score of character string w, and whether character string w is a preset name.
B14, the apparatus according to B11, where the training module includes: a training submodule, configured to obtain feature weight values corresponding to the character string feature data through training based on the training character string data; and a second generation submodule, configured to generate the target string judgment model based on the weight values of the character string feature data.
B15, the apparatus according to B14, where the training submodule includes: a training unit, configured to train based on the training character string data set to obtain a feature weight determination model; and a determination unit, configured to determine the feature weight values corresponding to the character string feature data based on the feature weight determination model.
B16, the apparatus according to B14, where the second generation submodule includes: a generation unit, configured to generate, according to the weight values of the character string feature data, a probability calculation model of character string w being a target string; and a confirmation unit, configured to confirm character strings whose probability meets a preset condition as target strings.
B17, the apparatus according to B16, where the probability calculation model is expressed as:

p = 1 / (1 + exp(-Σi λi fi))

where fi represents the i-th feature in the character string feature data, λi represents the weight value corresponding to the i-th feature fi, and p represents the probability value of the character string being a target string.
B18, the apparatus according to B16, where the confirmation unit is configured to confirm character strings whose probability exceeds a preset probability threshold as target strings.
B19, the apparatus according to B11, where the test character string is a character string input within a preset historical time period.
B20, the apparatus according to B11, further including: an execution module, configured to perform a predetermined operation on the target string.
The present disclosure discloses C21, an electronic device, including a memory and a processor, where the memory is used to store one or more computer instructions, and the one or more computer instructions are executed by the processor to realize the method according to any one of A1-A10.
The disclosure also discloses D22, a computer-readable storage medium on which computer instructions are stored, where the computer instructions, when executed by a processor, realize the method according to any one of A1-A10.
Claims (10)
- A kind of 1. character string method for digging, it is characterised in that the described method includes:Training string data collection is obtained, wherein, the trained string data collection includes training string data and character string Characteristic;The trained string data collection is trained, obtains target string judgment models;Target string judgement is carried out to test character string according to the target string judgment models.
- 2. according to the method described in claim 1, it is characterized in that, the training string data that obtains concentrates acquisition training word String data is accorded with, including:Obtain history string data;The data of target string will be confirmed as in the history string data as training positive sample;The data of non-targeted character string will be confirmed as in the history string data as training negative sample;Based on the trained positive sample and training negative sample generation training string data.
- 3. according to the method described in claim 1, it is characterized in that, the character string characteristic includes:Character string w is default Word frequency score value in historical time section, the mutual information score value of character string w, the comentropy score value of character string w, character string w whether be One or more in preset name.
- 4. according to the method described in claim 1, it is characterized in that, described pair of trained string data collection is trained, obtain Target string judgment models, including:Feature weight value corresponding with character string characteristic is got based on the trained string data training;Weighted value generation target string judgment models based on the character string characteristic.
- 5. according to the method described in claim 4, it is characterized in that, described got and word based on training string data training The corresponding feature weight value of symbol string characteristic, including:It is trained based on the trained string data collection, obtains feature weight and determine model;Determine that model determines feature weight value corresponding with the character string characteristic based on the feature weight.
- 6. the according to the method described in claim 4, it is characterized in that, weighted value life based on the character string characteristic Into target string judgment models, including:The probability calculation model that character string w is target string is generated according to the weighted value of the character string characteristic;The character string that probability is met to preset condition confirms as target string.
- 7. according to the method described in claim 6, it is characterized in that, the probability calculation model is expressed as:<mrow> <mi>p</mi> <mo>=</mo> <mfrac> <mn>1</mn> <mrow> <mn>1</mn> <mo>+</mo> <mi>exp</mi> <mrow> <mo>(</mo> <mo>-</mo> <msub> <mo>&Sigma;</mo> <mi>i</mi> </msub> <msub> <mi>&lambda;</mi> <mi>i</mi> </msub> <msub> <mi>f</mi> <mi>i</mi> </msub> <mo>)</mo> </mrow> </mrow> </mfrac> </mrow>Wherein, fiRepresent the ith feature in character string characteristic, λiRepresent ith feature fiCorresponding weighted value, p are represented Character string is the probable value of target string.
- 8. A character string mining apparatus, the apparatus including: an acquisition module configured to obtain a training character string data set, wherein the training character string data set includes training character string data and character string characteristic data; a training module configured to train on the training character string data set to obtain a target character string judgment model; and a judgment module configured to perform target character string judgment on a test character string according to the target character string judgment model.
- 9. An electronic device, including a memory and a processor, wherein the memory is configured to store one or more computer instructions, and the one or more computer instructions are executed by the processor to implement the method according to any one of claims 1-7.
- 10. A computer-readable storage medium storing computer instructions, wherein the computer instructions, when executed by a processor, implement the method according to any one of claims 1-7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711230875.0A CN107992570A (en) | 2017-11-29 | 2017-11-29 | Character string method for digging, device, electronic equipment and computer-readable recording medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN107992570A (en) | 2018-05-04 |
Family
ID=62034309
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201711230875.0A Pending CN107992570A (en) | 2017-11-29 | 2017-11-29 | Character string method for digging, device, electronic equipment and computer-readable recording medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107992570A (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102043845A (en) * | 2010-12-08 | 2011-05-04 | 百度在线网络技术(北京)有限公司 | Method and equipment for extracting core keywords based on query sequence cluster |
US8326861B1 (en) * | 2010-06-23 | 2012-12-04 | Google Inc. | Personalized term importance evaluation in queries |
CN104866496A (en) * | 2014-02-22 | 2015-08-26 | 腾讯科技(深圳)有限公司 | Method and device for determining morpheme significance analysis model |
CN104978356A (en) * | 2014-04-10 | 2015-10-14 | 阿里巴巴集团控股有限公司 | Synonym identification method and device |
CN106649666A (en) * | 2016-11-30 | 2017-05-10 | 浪潮电子信息产业股份有限公司 | Left-right recursion-based new word discovery method |
- 2017-11-29: CN application CN201711230875.0A filed; status: Pending
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108933781A (en) * | 2018-06-19 | 2018-12-04 | 上海点融信息科技有限责任公司 | Method, apparatus and computer readable storage medium for processing character string |
CN108933781B (en) * | 2018-06-19 | 2021-07-02 | 上海点融信息科技有限责任公司 | Method, apparatus and computer-readable storage medium for processing character string |
CN112629821A (en) * | 2020-11-17 | 2021-04-09 | 中国移动通信集团江苏有限公司 | Optical cable position determining method and device, electronic equipment and storage medium |
CN112629821B (en) * | 2020-11-17 | 2023-10-27 | 中国移动通信集团江苏有限公司 | Method and device for determining optical cable position, electronic equipment and storage medium |
CN113361238A (en) * | 2021-05-21 | 2021-09-07 | 北京语言大学 | Method and device for automatically proposing question by recombining question types with language blocks |
CN113361238B (en) * | 2021-05-21 | 2022-02-11 | 北京语言大学 | Method and device for automatically proposing question by recombining question types with language blocks |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109783817B (en) | Text semantic similarity calculation model based on deep reinforcement learning | |
CN111444320B (en) | Text retrieval method and device, computer equipment and storage medium | |
Sun et al. | A new surrogate-assisted interactive genetic algorithm with weighted semisupervised learning | |
CN108363790A (en) | Method, apparatus, device and storage medium for assessment | |
CN107066449A (en) | Information-pushing method and device | |
CN107330115A (en) | Information recommendation method and device | |
CN105808590B (en) | Search engine implementation method, searching method and device | |
CN109271493A (en) | Language text processing method, device and storage medium | |
CN107870964A (en) | Sentence sorting method and system applied to an answer fusion system | |
CN107832305A (en) | Method and apparatus for generating information | |
US11347995B2 (en) | Neural architecture search with weight sharing | |
WO2023065859A1 (en) | Item recommendation method and apparatus, and storage medium | |
CN108804526A (en) | Interest determination system, interest determination method and storage medium | |
CN108154198A (en) | Knowledge base entity normalizing method, system, terminal and computer readable storage medium | |
CN111353033B (en) | Method and system for training text similarity model | |
CN107992570A (en) | Character string method for digging, device, electronic equipment and computer-readable recording medium | |
CN109388715A (en) | User data analysis method and device | |
CN107369052A (en) | User registration behavior prediction method, apparatus and electronic equipment | |
CN109903095A (en) | Data processing method, device, electronic equipment and computer readable storage medium | |
CN106445915A (en) | New word discovery method and device | |
CN108255706A (en) | Editing method, device, terminal device and storage medium for automatic test scripts | |
Krantsevich et al. | Stochastic tree ensembles for estimating heterogeneous effects | |
CN111831898A (en) | Sorting method and device, electronic equipment and readable storage medium | |
CN110046344A (en) | Method and terminal device for adding separators | |
CN107885879A (en) | Semantic analysis method, device, electronic equipment and computer-readable recording medium | |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 20180504 |