CN110457476A - Method and apparatus for generating a classification model - Google Patents
Method and apparatus for generating a classification model
- Publication number: CN110457476A (application CN201910721353.3A)
- Authority: CN (China)
- Prior art keywords: model, text, hyperparameter, training, initial
- Prior art date
- Legal status: Pending (the status listed is an assumption and is not a legal conclusion)
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval of unstructured textual data
- G06F16/35—Clustering; Classification
Abstract
Embodiments of the present disclosure disclose a method and apparatus for generating a classification model. One specific embodiment of the method includes: obtaining a training sample set, where each training sample includes a sample text and a sample category corresponding to the sample text; determining statistical features of the training sample set, where the statistical features include form features characterizing text length; and generating a text classification model based on the statistical features and on training of an initial model, where the text classification model characterizes the correspondence between text categories and texts, and the initial model is selected from a preset set of pre-trained models. This embodiment realizes automatic generation of a text classification model based on cloud computing technology, without requiring the user to tune hyperparameters by hand.
Description
Technical field
Embodiments of the present disclosure relate to the field of computer technology, and in particular to a method and apparatus for generating a classification model.
Background
With the rapid development of artificial intelligence (AI) and Internet technology, the volume of text information is growing quickly. Classifying text effectively is an important prerequisite for subsequently finding content and extracting the value of the information.
Two related approaches are common. The first extracts text features using feature engineering and classifies texts according to the similarity between the extracted features. The second classifies texts using a trained, automated text classification model.
Summary of the invention
Embodiments of the present disclosure propose a method and apparatus for generating a classification model.
In a first aspect, an embodiment of the present disclosure provides a method for generating a classification model, the method comprising: obtaining a training sample set, where each training sample includes a sample text and a sample category corresponding to the sample text; determining statistical features of the training sample set, where the statistical features include form features characterizing text length; and generating a text classification model based on the statistical features and on training of an initial model, where the text classification model characterizes the correspondence between text categories and texts, and the initial model is selected from a preset set of pre-trained models.
In some embodiments, generating the text classification model based on the statistical features and the initial model comprises: inputting the form features into a pre-trained hyperparameter generation model to obtain a hyperparameter group corresponding to the form features, where the hyperparameter group includes model hyperparameters and training hyperparameters; selecting, from the set of pre-trained models, a pre-trained model matching the model hyperparameters in the hyperparameter group as the initial model; and training the initial model according to the training hyperparameters in the hyperparameter group to generate the text classification model.
In some embodiments, generating the text classification model based on the statistical features and the initial model comprises: obtaining a set of initial hyperparameter groups, where each initial hyperparameter group includes initial model hyperparameters and initial training hyperparameters; selecting an initial hyperparameter group from the set and performing the following determination steps: selecting, from the set of pre-trained models, a pre-trained model matching the initial model hyperparameters in the selected initial hyperparameter group as the initial model; training the selected initial model according to the initial training hyperparameters in the selected initial hyperparameter group to generate a quasi text classification model corresponding to the initial hyperparameter group; evaluating the generated quasi text classification model on a verification text set to generate an evaluation result; and, in response to determining that a generated evaluation result satisfies a hyperparameter-group determination condition, determining the text classification model from among the quasi text classification models whose evaluation results satisfy the determination condition; and, in response to determining that the generated evaluation results do not satisfy the determination condition, updating the initial hyperparameter groups in the set, selecting an initial hyperparameter group from the updated set, and continuing the determination steps.
In some embodiments, the statistical features further include content features characterizing text content; and selecting the matching pre-trained model comprises: selecting, from the set of pre-trained models, a pre-trained model matching both the content features and the initial model hyperparameters in the selected initial hyperparameter group as the initial model, where each pre-trained model corresponds to a semantic label.
In some embodiments, training the selected initial model according to the initial training hyperparameters in the selected initial hyperparameter group to generate the corresponding quasi text classification model comprises: choosing training samples from the training sample set and performing the following training steps: inputting the sample texts of the chosen training samples into the selected initial model to generate text categories; determining a difference value from the generated text categories and the sample categories corresponding to the input sample texts; determining whether the difference value satisfies a training completion condition, where the difference value and the training completion condition are determined based on the initial training hyperparameters in the selected hyperparameter group; and, in response to determining that the training completion condition is satisfied, determining the selected initial model as the quasi text classification model corresponding to the selected hyperparameter group; and, in response to determining that the training completion condition is not satisfied, adjusting the relevant parameters of the selected initial model, choosing further training samples from the training sample set, using the adjusted initial model as the selected initial model, and continuing the training steps.
In some embodiments, obtaining the training sample set comprises: receiving an annotated text set sent by a user terminal, where each annotated text includes a text and text-category annotation information corresponding to the text; and dividing the annotated text set to generate the training sample set and a verification text set, where each training sample includes a text as the sample text and the text-category annotation information as the sample category corresponding to the sample text.
In some embodiments, the method further comprises: receiving a set of texts to be classified sent by a user terminal; inputting the set of texts to be classified into the text classification model to generate classification information corresponding to each text to be classified, where the classification information characterizes the category to which the text belongs and matches the sample categories; and sending the generated classification information, together with corresponding to-be-classified text information, to the user terminal, where the to-be-classified text information identifies the text within the set of texts to be classified.
In a second aspect, an embodiment of the present disclosure provides an apparatus for generating a classification model, the apparatus comprising: an acquisition unit configured to obtain a training sample set, where each training sample includes a sample text and a corresponding sample category; a determination unit configured to determine statistical features of the training sample set, the statistical features including form features characterizing text length; and a first generation unit configured to generate a text classification model based on the statistical features and on training of an initial model, where the text classification model characterizes the correspondence between text categories and texts, and the initial model is selected from a preset set of pre-trained models.
In some embodiments, the first generation unit includes: a first generation module configured to input the form features into a pre-trained hyperparameter generation model to obtain a hyperparameter group corresponding to the form features, the hyperparameter group including model hyperparameters and training hyperparameters; a selection module configured to select, from the set of pre-trained models, a pre-trained model matching the model hyperparameters in the hyperparameter group as the initial model; and a second generation module configured to train the initial model according to the training hyperparameters in the hyperparameter group to generate the text classification model.
In some embodiments, the first generation unit includes: an acquisition module configured to obtain a set of initial hyperparameter groups, each including initial model hyperparameters and initial training hyperparameters; a determination module configured to select an initial hyperparameter group from the set and perform the following determination steps: selecting, from the set of pre-trained models, a pre-trained model matching the initial model hyperparameters in the selected initial hyperparameter group as the initial model; training the selected initial model according to the initial training hyperparameters to generate a quasi text classification model corresponding to the initial hyperparameter group; evaluating the generated quasi text classification model on a verification text set to generate an evaluation result; and, in response to determining that a generated evaluation result satisfies a hyperparameter-group determination condition, determining the text classification model from among the quasi text classification models whose evaluation results satisfy the condition; and an update module configured to, in response to determining that the generated evaluation results do not satisfy the determination condition, update the initial hyperparameter groups in the set, select an initial hyperparameter group from the updated set, and continue the determination steps.
In some embodiments, the statistical features further include content features characterizing text content, and the determination module is further configured to select, from the set of pre-trained models, a pre-trained model matching both the content features and the initial model hyperparameters in the selected initial hyperparameter group as the initial model, where each pre-trained model corresponds to a semantic label.
In some embodiments, the determination module further includes: a selection submodule configured to choose training samples from the training sample set and perform the following training steps: inputting the sample texts of the chosen training samples into the selected initial model to generate text categories; determining a difference value from the generated text categories and the sample categories corresponding to the input sample texts; determining whether the difference value satisfies a training completion condition, where the difference value and the training completion condition are determined based on the initial training hyperparameters in the selected hyperparameter group; and, in response to determining that the training completion condition is satisfied, determining the selected initial model as the quasi text classification model corresponding to the selected hyperparameter group; and an adjustment submodule configured to, in response to determining that the training completion condition is not satisfied, adjust the relevant parameters of the selected initial model, choose further training samples from the training sample set, use the adjusted initial model as the selected initial model, and continue the training steps.
In some embodiments, the acquisition unit includes: a receiving module configured to receive an annotated text set sent by a user terminal, where each annotated text includes a text and corresponding text-category annotation information; and a third generation module configured to divide the annotated text set to generate the training sample set and a verification text set, where each training sample includes a text as the sample text and the text-category annotation information as the corresponding sample category.
In some embodiments, the apparatus further includes: a receiving unit configured to receive a set of texts to be classified sent by a user terminal; a second generation unit configured to input the set of texts to be classified into the text classification model and generate classification information corresponding to each text to be classified, where the classification information characterizes the category to which the text belongs and matches the sample categories; and a sending unit configured to send the generated classification information, together with corresponding to-be-classified text information, to the user terminal, where the to-be-classified text information identifies the text within the set of texts to be classified.
In a third aspect, an embodiment of the present disclosure provides a server comprising: one or more processors; and a storage device on which one or more programs are stored, the one or more programs, when executed by the one or more processors, causing the one or more processors to implement the method described in any implementation of the first aspect.
In a fourth aspect, an embodiment of the present disclosure provides a computer-readable medium on which a computer program is stored, the program, when executed by a processor, implementing the method described in any implementation of the first aspect.
The method and apparatus for generating a classification model provided by embodiments of the present disclosure first obtain a training sample set, where each training sample includes a sample text and a corresponding sample category. Statistical features of the training sample set are then determined, the statistical features including form features characterizing text length. A text classification model is then generated based on the statistical features and on training of an initial model, where the text classification model characterizes the correspondence between text categories and texts, and the initial model is selected from a preset set of pre-trained models. A text classification model can thus be generated automatically, without manual hyperparameter tuning.
Brief description of the drawings
Other features, objects, and advantages of the present disclosure will become more apparent upon reading the following detailed description of non-restrictive embodiments with reference to the accompanying drawings:
Fig. 1 is an exemplary system architecture diagram to which an embodiment of the present disclosure may be applied;
Fig. 2 is a flowchart of one embodiment of the method for generating a classification model according to the present disclosure;
Fig. 3 is a schematic diagram of an application scenario of the method for generating a classification model according to an embodiment of the present disclosure;
Fig. 4 is a flowchart of another embodiment of the method for generating a classification model according to the present disclosure;
Fig. 5 is a structural schematic diagram of one embodiment of the apparatus for generating a classification model according to the present disclosure;
Fig. 6 is a structural schematic diagram of an electronic device suitable for implementing embodiments of the present disclosure.
Detailed description of embodiments
The present disclosure is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the related invention and do not limit it. It should also be noted that, for ease of description, the drawings show only the parts relevant to the related invention.
It should be noted that, in the absence of conflict, the embodiments of the present disclosure and the features within them may be combined with one another. The present disclosure is described in detail below with reference to the drawings and in conjunction with the embodiments.
Fig. 1 shows an exemplary architecture 100 in which the method for generating a classification model or the apparatus for generating a classification model of the present disclosure may be applied.
As shown in Fig. 1, the system architecture 100 may include terminal devices 101, 102, and 103, a network 104, and a server 105. The network 104 is the medium providing communication links between the terminal devices 101, 102, 103 and the server 105, and may include various connection types, such as wired or wireless communication links or fiber-optic cables.
The terminal devices 101, 102, 103 interact with the server 105 through the network 104 to receive or send messages. Various communication client applications may be installed on the terminal devices 101, 102, 103, such as web browsers, search applications, instant messaging tools, email clients, social platform software, text editing applications, and reading applications.
The terminal devices 101, 102, 103 may be hardware or software. When they are hardware, they may be any electronic device with a display screen that supports text display, including but not limited to smartphones, tablet computers, e-book readers, laptop computers, and desktop computers. When they are software, they may be installed on the electronic devices listed above and may be implemented as multiple pieces of software or software modules (for example, software or software modules for providing distributed services) or as a single piece of software or software module. No specific limitation is made here.
The server 105 may be a server providing various services, for example a background server supporting a text classification application on the terminal devices 101, 102, 103. Optionally, the server 105 may also be a cloud server. The background server may train a text classification model from an obtained training sample set, may analyze texts sent by the terminal devices, and may feed processing results (such as text category information) back to the terminal devices.
It should be noted that the training sample set may also be stored directly on the server 105, in which case the server 105 may extract the locally stored training sample set for model training, and the terminal devices 101, 102, 103 and the network 104 may be absent.
It should be noted that the server may be hardware or software. When the server is hardware, it may be implemented as a distributed server cluster composed of multiple servers or as a single server. When the server is software, it may be implemented as multiple pieces of software or software modules (for example, software or software modules for providing distributed services) or as a single piece of software or software module. No specific limitation is made here.
It should be noted that the method for generating a classification model provided by embodiments of the present disclosure is generally executed by the server 105, and correspondingly, the apparatus for generating a classification model is generally disposed in the server 105.
It should be understood that the numbers of terminal devices, networks, and servers in Fig. 1 are merely illustrative; there may be any number of each, according to implementation needs.
With continued reference to Fig. 2, a flow 200 of one embodiment of the method for generating a classification model according to the present disclosure is shown. The method for generating a classification model comprises the following steps:
Step 201: obtain a training sample set.
In this embodiment, the executing body of the method for generating a classification model (for example, the server 105 shown in Fig. 1) may obtain the training sample set through a wired or wireless connection. Each training sample may include a sample text and a sample category corresponding to the sample text. Specifically, the executing body may obtain a training sample set pre-stored locally, or may obtain a training sample set sent by an electronic device with which it communicates (for example, a terminal device shown in Fig. 1).
In practice, the training samples may be obtained in several ways. As an example, a technician may annotate each text in a text collection with a category; each text is then stored in association with its annotated category to obtain a training sample. As another example, information resources on a portal website may be processed: an article on a web page serves as the sample text and the column to which the article belongs serves as the sample category, forming a training sample, as sketched below. A large number of training samples formed from a large amount of such data then constitute the training sample set.
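As an illustrative sketch of the portal-website example (the articles and column names are invented for illustration, not taken from the patent):

```python
# Hypothetical portal data: each article's text becomes the sample text and
# the column it was published under becomes the sample category.
articles = [
    {"text": "The central bank cut interest rates today ...", "column": "finance"},
    {"text": "The home team won the championship final ...", "column": "sports"},
]

training_sample_set = [(a["text"], a["column"]) for a in articles]
print(training_sample_set[1])
# ('The home team won the championship final ...', 'sports')
```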
In some optional implementations of this embodiment, the executing body may also obtain the training sample set through the following steps:
First, receive an annotated text set sent by a user terminal.
In these implementations, the executing body may receive an annotated text set uploaded by a user terminal (for example, a terminal device shown in Fig. 1). Each annotated text may include a text and text-category annotation information corresponding to the text.
Second, divide the annotated text set to generate the training sample set and a verification text set.
In these implementations, the executing body may divide the received annotated text set according to a certain ratio, obtaining the training sample set and the verification text set. Each training sample may include a text as the sample text and the text-category annotation information as the corresponding sample category. The ratio may generally be preset, for example 8:2 or 7:3; optionally, it may also be set according to the user's choice. A sketch of such a split follows.
Step 202: determine statistical features of the training sample set.
In this embodiment, the executing body may determine the statistical features of the training sample set in various ways. The statistical features may include form features characterizing text length, and may include but are not limited to at least one of: the number of sample categories, the maximum text length, the minimum text length, and the average text length.
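As an illustrative sketch (not the patent's actual implementation), these form features could be computed as follows:

```python
def form_features(samples):
    """Compute form features characterizing text length for a training
    sample set of (sample_text, sample_category) pairs: category count
    and maximum/minimum/average text length."""
    lengths = [len(text) for text, _ in samples]
    return {
        "num_categories": len({category for _, category in samples}),
        "max_text_length": max(lengths),
        "min_text_length": min(lengths),
        "avg_text_length": sum(lengths) / len(lengths),
    }

samples = [("stocks rose sharply today", "finance"),
           ("the match ended in a draw", "sports")]
print(form_features(samples))
# {'num_categories': 2, 'max_text_length': 25, 'min_text_length': 25, 'avg_text_length': 25.0}
```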
In some optional implementations of this embodiment, the statistical features may also include content features characterizing text content, which may include but are not limited to at least one of: a word-frequency vector, and a text feature vector determined based on the vector space model (VSM).
Step 203: generate a text classification model based on the statistical features and on training of an initial model.
In this embodiment, the executing body may generate the text classification model in various ways, based on the statistical features and on training of the initial model. The text classification model may be used to characterize the correspondence between text categories and texts. The initial model may be selected from a preset set of pre-trained models. The pre-trained models in this set may be models trained in advance on massive data sets from different fields (such as finance, law, science and technology, and sports). Each pre-trained model may be used to characterize the correspondence between text categories and texts, and may embody enough knowledge of the related field, such as underlying semantics and reasoning. It will be appreciated that pre-trained models generated from massive data sets of different fields can each be regarded as an initial representation for meta-learning (for example, model-agnostic meta-learning, MAML) in that field, so that after subsequent training adjusts the model, the pre-trained model can be used for text classification in subfields of the field covered by the massive data set.
In some optional implementations of this embodiment, the executing body may generate the text classification model through the following steps:
First, input the form features into a pre-trained hyperparameter generation model to obtain a hyperparameter group corresponding to the form features.
In these optional implementations, the executing body may input the form features determined in step 202 into a pre-trained hyperparameter generation model, obtaining a hyperparameter group corresponding to the form features. The hyperparameter group may include model hyperparameters and training hyperparameters. The model hyperparameters may characterize the attributes of the model itself; for example, they may include but are not limited to at least one of: the number of layers of a neural network, the number of nodes in each hidden layer, and the dimension of the word vectors (embeddings). The training hyperparameters may be used to direct the training process of the model; for example, they may include but are not limited to at least one of: the learning rate, the batch size, the maximum gradient clip, the dropout value (for example 0.5), and the L2 regularization value (for example 1.0).
In these optional implementations, the hyperparameter generation model may be used to characterize the correspondence between form features and hyperparameter groups. As an example, the hyperparameter generation model may be a mapping table, pre-specified by a technician based on statistics over a large number of form features and the hyperparameter groups that trained well for them, storing the correspondence between multiple form features and hyperparameter groups. As another example, the hyperparameter generation model may be a model trained with a machine learning algorithm on a large number of samples, where each sample may consist of form features and a corresponding hyperparameter group that trained well.
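A minimal sketch of the mapping-table variant; the threshold and every hyperparameter value below are illustrative assumptions, not the patent's actual mapping:

```python
def generate_hyperparameter_group(form_feats):
    """Stand-in for the pre-trained hyperparameter generation model: map
    form features to a hyperparameter group containing model hyperparameters
    and training hyperparameters. All thresholds and values are invented."""
    if form_feats["avg_text_length"] < 500:
        return {"model": {"num_layers": 2, "embedding_dim": 128},
                "training": {"learning_rate": 1e-3, "batch_size": 64,
                             "dropout": 0.5, "l2": 1.0}}
    return {"model": {"num_layers": 4, "embedding_dim": 256},
            "training": {"learning_rate": 5e-4, "batch_size": 32,
                         "dropout": 0.5, "l2": 1.0}}

print(generate_hyperparameter_group({"avg_text_length": 3000})["model"])
# {'num_layers': 4, 'embedding_dim': 256}
```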
Second, select from the set of pre-trained models a pre-trained model matching the model hyperparameters in the hyperparameter group as the initial model.
In these implementations, the executing body may select from the set of pre-trained models a pre-trained model whose structure matches the model hyperparameters in the hyperparameter group obtained in the first step as the initial model, where matching may mean being identical or similar. For example, if the model hyperparameters specify 2 hidden layers, the initial model may be a neural network with 2 hidden layers.
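A sketch of this matching step, under the assumption that each pre-trained model is described by a structural descriptor (the model names and fields here are hypothetical):

```python
def choose_initial_model(pretrained_models, model_hparams):
    """Select the pre-trained model whose structure is identical or closest
    to the model hyperparameters, here matched on hidden-layer count."""
    return min(pretrained_models,
               key=lambda m: abs(m["num_layers"] - model_hparams["num_layers"]))

pretrained_set = [{"name": "finance-base", "num_layers": 2},
                  {"name": "law-large", "num_layers": 4}]
print(choose_initial_model(pretrained_set, {"num_layers": 2})["name"])  # finance-base
```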
Third, train the initial model according to the training hyperparameters in the hyperparameter group, generating the text classification model.
In these implementations, the executing body may train the initial model selected in the second step using various machine learning algorithms, as directed by the training hyperparameters in the hyperparameter group obtained in the first step, generating the text classification model.
In some optional implementations of this embodiment, the executing body may also generate the text classification model through the following steps:
First, obtain a set of initial hyperparameter groups.
In these implementations, the executing body may first obtain the set of initial hyperparameter groups through a wired or wireless connection. Each initial hyperparameter group may include initial model hyperparameters and initial training hyperparameters, whose descriptions are consistent with the model hyperparameters and training hyperparameters of the aforementioned hyperparameter group and are not repeated here.
In these implementations, the executing body may obtain a set of initial hyperparameter groups pre-stored locally, or a set sent by an electronic device with which it communicates (for example, a data server). Optionally, the executing body may also obtain, locally or from the electronic device, a set of initial hyperparameter groups matched to the statistical features; for example, there may be a matching relationship between the average text length in the statistical features and a training-iteration threshold in the initial hyperparameter groups. Optionally, the executing body may also randomly generate an initial value for each hyperparameter in an initial hyperparameter group, thereby generating the set of initial hyperparameter groups.
Second, select an initial hyperparameter group from the set of initial hyperparameter groups and perform the following determination steps:
S1: select from the set of pre-trained models a pre-trained model matching the initial model hyperparameters in the selected initial hyperparameter group as the initial model.
In these implementations, the executing body may select from the set of pre-trained models one or more pre-trained models matching the initial model hyperparameters in the selected initial hyperparameter group as initial models. Optionally, the executing body may also select one or more pre-trained models matching the statistical features as initial models; for example, there may be a matching relationship between the maximum text length in the statistical features and the number of hidden nodes in the initial hyperparameter group.
Optionally, based on the content features in the statistical features, the executing body may also select from the set of pre-trained models one or more pre-trained models matching both the content features and the initial model hyperparameters in the selected initial hyperparameter group as initial models. Each pre-trained model may correspond to a semantic label; in general, the semantic label may be consistent with the field of the data set on which the pre-trained model was pre-trained. The executing body may determine the degree of match between the content features and the semantic labels in various ways; as an example, it may compute the similarity between the word-frequency vector of the training sample set and the semantic labels.
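One concrete way to score that match, sketched with plain word-count dictionaries; the label vocabularies are invented for illustration, and cosine similarity is only one plausible choice of similarity measure:

```python
import math

def cosine_similarity(vec_a, vec_b):
    """Cosine similarity between two sparse word-count dicts."""
    common = set(vec_a) & set(vec_b)
    dot = sum(vec_a[w] * vec_b[w] for w in common)
    norm_a = math.sqrt(sum(v * v for v in vec_a.values()))
    norm_b = math.sqrt(sum(v * v for v in vec_b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

corpus_vec = {"stock": 5, "bond": 3, "coach": 1}          # word-frequency vector of the samples
label_vecs = {"finance": {"stock": 1, "bond": 1},          # hypothetical semantic-label vocabularies
              "sports": {"coach": 1, "match": 1}}
best = max(label_vecs, key=lambda lbl: cosine_similarity(corpus_vec, label_vecs[lbl]))
print(best)  # finance
```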
S2: train the selected initial model according to the initial training hyperparameters in the selected initial hyperparameter group, generating a quasi text classification model corresponding to the initial hyperparameter group.
Based on the above optional implementations, the executing body may train the initial model selected in S1 using various machine learning algorithms, as directed by the initial training hyperparameters in the initial hyperparameter group selected in the second step, generating the quasi text classification model corresponding to the selected initial hyperparameter group.
Optionally, the executing body may also generate the quasi text classification model corresponding to the selected initial hyperparameter group through the following steps:
Step 1: choose training samples from the training sample set and perform the following training steps:
S21: input the sample texts of the chosen training samples into the selected initial model, generating text categories.
S22: determine a difference value from the generated text categories and the sample categories corresponding to the input sample texts.
S23: determine whether the difference value satisfies a training completion condition.
Based on the above optional implementations, the difference value and the training completion condition may be determined based on the initial training hyperparameters in the selected hyperparameter group. As an example, the initial training hyperparameters may include a loss function and a training-completion-condition threshold. The difference value may be determined according to the loss function. The training-completion-condition threshold may include at least one of: a training-duration threshold, a training-iteration threshold, a difference-value threshold, and an accuracy threshold on the verification text set.
S24: in response to determining that the training completion condition is satisfied, determine the selected initial model as the quasi text classification model corresponding to the selected hyperparameter group.
Step 2: in response to determining that the training completion condition is not satisfied, adjust the relevant parameters of the selected initial model, choose further training samples from the training sample set, use the adjusted initial model as the selected initial model, and continue the above training steps.
Based on the above optional implementations, in response to determining that the training completion condition is not satisfied, the executing body may adjust the relevant parameters of the selected initial model using various methods, choose training samples from the training sample set, use the adjusted initial model as the selected initial model, and continue the training steps. It should be noted that, depending on whether one or more samples are chosen at a time, the parameter-adjustment method may include but is not limited to at least one of: batch gradient descent (BGD), stochastic gradient descent (SGD), and mini-batch gradient descent (MBGD).
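An illustrative sketch of training steps S21-S24: a toy bag-of-words perceptron stands in for the selected initial model, and the per-epoch error rate stands in for a loss-based difference value. None of this is the patent's concrete model; it only shows the loop structure of sample selection, difference computation, completion check, and parameter adjustment:

```python
from collections import defaultdict

class TinyTextClassifier:
    """A toy bag-of-words perceptron standing in for the selected initial model."""
    def __init__(self, categories):
        self.categories = list(categories)
        self.weights = {c: defaultdict(float) for c in self.categories}

    def predict(self, text):  # S21: generate a text category for a sample text
        return max(self.categories,
                   key=lambda c: sum(self.weights[c][w] for w in text.split()))

def train_initial_model(model, samples, hp):
    """S21-S24: iterate until the error rate meets the completion condition
    or the training-iteration threshold in the hyperparameters runs out."""
    for _ in range(hp["max_steps"]):
        errors = 0
        for text, category in samples:
            predicted = model.predict(text)                    # S21
            if predicted != category:                          # S22: difference value
                errors += 1
                for w in text.split():                         # perceptron-style adjustment
                    model.weights[category][w] += hp["learning_rate"]
                    model.weights[predicted][w] -= hp["learning_rate"]
        if errors / len(samples) <= hp["diff_threshold"]:      # S23/S24: completion condition
            break
    return model  # the quasi text classification model

samples = [("stocks rose", "finance"), ("match won", "sports")]
model = train_initial_model(TinyTextClassifier(["finance", "sports"]), samples,
                            {"max_steps": 50, "learning_rate": 1.0, "diff_threshold": 0.0})
print(model.predict("match lost"))  # sports
```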
S3: evaluate the generated quasi text classification model on a verification text set, generating an evaluation result.
Based on the above optional implementations, the executing body may evaluate the generated quasi text classification model using a verification text set, generating an evaluation result. The verification text set may be preset; each verification text may include a text and verification-text annotation information corresponding to the text. Optionally, the verification text set may also be generated by dividing the annotated text set received from the user terminal.
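A minimal accuracy-based evaluation, reusing the toy `model` from the training sketch above (accuracy here stands in for whatever evaluation result an implementation actually uses):

```python
def evaluate_on_verification_set(model, verification_set):
    """S3: evaluate a quasi text classification model on the verification
    text set; plain accuracy serves as the evaluation result."""
    correct = sum(model.predict(text) == label for text, label in verification_set)
    return correct / len(verification_set)

print(evaluate_on_verification_set(model, [("stocks rose", "finance"),
                                           ("match won", "sports")]))  # 1.0
```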
S4: in response to determining that a generated evaluation result satisfies a hyperparameter-group determination condition, determine the text classification model from among the quasi text classification models whose evaluation results satisfy the determination condition.
Based on the above optional implementations, in response to determining that a generated evaluation result satisfies the hyperparameter-group determination condition, the executing body may determine the text classification model from among the quasi text classification models whose evaluation results satisfy the condition, in accordance with the way initial hyperparameter groups were selected from the set. The hyperparameter-group determination condition may include but is not limited to at least one of: the accuracy on the verification text set exceeds a preset accuracy threshold; the number of iterations over initial hyperparameter groups exceeds a preset threshold; and the difference between the verification accuracies corresponding to the initial hyperparameter groups of two adjacent iterations is less than a preset difference threshold.
As an example, the executing body may select one initial hyperparameter group at a time from the set; it may then determine the quasi text classification model whose evaluation result satisfies the hyperparameter-group determination condition as the text classification model. As another example, the executing body may select multiple initial hyperparameter groups at a time, in which case multiple quasi text classification models may satisfy the determination condition; the executing body may then typically choose the quasi text classification model with the best evaluation result, for example the highest accuracy on the verification text set, as the text classification model.
Third, in response to determining that the generated evaluation results do not satisfy the hyperparameter-group determination condition, update the hyperparameter groups in the set of initial hyperparameter groups, select an initial hyperparameter group from the updated set, and continue the above determination steps.
In these implementations, in response to determining that the generated evaluation results do not satisfy the hyperparameter-group determination condition, the executing body may update the hyperparameter groups in the set of initial hyperparameter groups in various ways, then select an initial hyperparameter group from the updated set and continue the determination steps. It should be noted that, depending on whether one or more initial hyperparameter groups are selected at a time, the update method may include but is not limited to at least one of: genetic algorithms (GA), simulated annealing (SA), ant colony algorithms (ACA), and Bayesian optimization.
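Sketched below is the simplest possible version of this determine/update loop. Random perturbation of the learning rate stands in for the GA/SA/ACA/Bayesian updates named above, and the two callbacks are assumed interfaces rather than anything specified by the patent:

```python
import random

def search_hyperparameter_groups(initial_groups, build_and_train, evaluate_fn,
                                 accuracy_threshold=0.9, max_rounds=10, seed=0):
    """Determination steps S1-S4 plus the update step: train a quasi text
    classification model per initial hyperparameter group, evaluate it on
    the verification set, stop once the determination condition (accuracy
    threshold or round budget) is met, otherwise perturb the groups and retry."""
    rng = random.Random(seed)
    groups, best_model, best_acc = list(initial_groups), None, -1.0
    for _ in range(max_rounds):
        for hp in groups:
            candidate = build_and_train(hp)    # S1 + S2: select and train an initial model
            acc = evaluate_fn(candidate)       # S3: evaluation result
            if acc > best_acc:
                best_model, best_acc = candidate, acc
        if best_acc >= accuracy_threshold:     # S4: determination condition satisfied
            break
        groups = [{**hp, "learning_rate": hp["learning_rate"] * rng.uniform(0.5, 2.0)}
                  for hp in groups]            # third step: update the hyperparameter groups
    return best_model
```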
In some optional implementations of this embodiment, the executing body may also send, to a target terminal, information indicating that training of the text classification model has been completed.
With continued reference to Fig. 3, Fig. 3 is a schematic diagram of an application scenario of the method for generating a classification model according to an embodiment of the present disclosure. In the application scenario of Fig. 3, a user 301 uses a terminal device 302 to upload an annotated set 303 of investment-related texts to a background server 304. The background server 304 may determine that the average text length of the annotated set 303 is 3,000 words, and may then generate a corresponding hyperparameter group 3031 from the determined average text length. Next, the background server 304 may select, from a preset set of pre-trained models 3032, a pre-trained model trained on a financial-field data set as the initial model 3033, train the initial model 3033 according to the generated hyperparameter group 3031, and generate a text classification model 305. Optionally, the background server 304 may also send the terminal device 302 a prompt 306 indicating that model training has been completed.
At present, the prior art typically obtains feature word vectors through complicated feature engineering, or trains an initial model with preset hyperparameters. Because feature engineering design and hyperparameter selection generally require rich modeling experience from the user, the technical threshold for training a text classification model is high. In contrast, the method provided by the above embodiments of the disclosure determines the statistical features of the training sample set, selects a pre-trained model according to those statistical features, and trains in a way matched to them. A text classification model is thereby generated by fine-tuning a pre-trained model on a small-scale sample set. Moreover, no manual hyperparameter tuning is needed during training, which significantly lowers the threshold of use and lets users quickly deploy text classification applications suited to their own needs at low cost.
With further reference to Fig. 4, a flow 400 of another embodiment of the method for generating a classification model is shown. The flow 400 of the method for generating a classification model comprises the following steps:
Step 401: obtain a training sample set.
Step 402: determine statistical features of the training sample set.
Step 403: generate a text classification model based on the statistical features and on training of an initial model.
Steps 401, 402, and 403 are consistent with steps 201, 202, and 203 of the previous embodiment, respectively; the descriptions above of steps 201, 202, and 203 also apply to steps 401, 402, and 403 and are not repeated here.
Step 404: receive a set of texts to be classified sent by a user terminal.
In this embodiment, the executing body of the method for generating a classification model (for example, the server 105 shown in Fig. 1) may receive, through a wired or wireless connection, a set of texts to be classified sent by a user terminal (for example, a terminal device shown in Fig. 1).
It should be noted that step 404 and step 401 may be executed essentially in parallel, or step 404 may be executed first and step 401 afterwards.
Step 405: input the set of texts to be classified into the text classification model, generating classification information corresponding to each text to be classified in the set.
In this embodiment, the executing body may input the set of texts to be classified received in step 404 into the text classification model generated in step 403, generating classification information corresponding to each text to be classified. The classification information may characterize the category to which the text belongs. It will be appreciated that, because the text classification model was trained on the training sample set obtained in step 401, the generated classification information matches the sample categories of the training samples.
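A minimal sketch of steps 405-406 together; the stub below stands in for the trained text classification model, and using a list index as the to-be-classified text information is an assumption made only for illustration:

```python
class StubModel:
    """Stands in for the trained text classification model."""
    def predict(self, text):
        return "finance" if "stock" in text else "sports"

def classify_texts(model, texts):
    """Generate classification information for each text to be classified,
    paired with to-be-classified text information (here an index) that
    identifies the text when results are sent back to the user terminal."""
    return [{"text_id": i, "category": model.predict(t)} for i, t in enumerate(texts)]

print(classify_texts(StubModel(), ["stocks dipped", "coach resigned"]))
# [{'text_id': 0, 'category': 'finance'}, {'text_id': 1, 'category': 'sports'}]
```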
Step 406: send the generated classification information, together with corresponding to-be-classified text information, to the user terminal.
In this embodiment, the executing body may send the generated classification information, together with the corresponding to-be-classified text information, to the user terminal through a wired or wireless connection. The to-be-classified text information may be used to identify the text within the set of texts to be classified.
As can be seen from Fig. 4, the flow 400 of the method for generating a classification model in this embodiment embodies the step of classifying user-uploaded texts with the generated text classification model. The scheme described in this embodiment can thus train a text classification model from annotated texts uploaded by the user, and then use the trained model to classify uploaded unannotated texts. A targeted text classification model can therefore be trained and applied from existing samples without manual hyperparameter tuning, lowering the threshold for training and using text classification models.
With further reference to Fig. 5, as an implementation of the methods shown in the above figures, the present disclosure provides an embodiment of an apparatus for generating a classification model. This apparatus embodiment corresponds to the method embodiment shown in Fig. 2, and the apparatus may be applied in various electronic devices.
As shown in Fig. 5, the apparatus 500 for generating a classification model provided by this embodiment includes an acquisition unit 501, a determination unit 502, and a first generation unit 503. The acquisition unit 501 is configured to obtain a training sample set, where each training sample includes a sample text and a corresponding sample category. The determination unit 502 is configured to determine statistical features of the training sample set, the statistical features including form features characterizing text length. The first generation unit 503 is configured to generate a text classification model based on the statistical features and on training of an initial model, where the text classification model characterizes the correspondence between text categories and texts, and the initial model is selected from a preset set of pre-trained models.
In this embodiment, for the specific processing of the acquisition unit 501, the determination unit 502, and the first generation unit 503 of the apparatus 500 for generating a classification model, and for the technical effects they bring, reference may be made to the descriptions of steps 201, 202, and 203 in the embodiment corresponding to Fig. 2, which are not repeated here.
In some optional implementations of this embodiment, the first generation unit 503 may include a first generation module (not shown), a selection module (not shown), and a second generation module (not shown). The first generation module may be configured to input the form features into a pre-trained hyperparameter generation model to obtain a hyperparameter group corresponding to the form features, the hyperparameter group including model hyperparameters and training hyperparameters. The selection module may be configured to select from the set of pre-trained models a pre-trained model matching the model hyperparameters in the hyperparameter group as the initial model. The second generation module may be configured to train the initial model according to the training hyperparameters in the hyperparameter group to generate the text classification model.
In some optional implementations of this embodiment, the first generation unit 503 may include an acquisition module (not shown), a determination module (not shown), and an update module (not shown). The acquisition module may be configured to obtain a set of initial hyperparameter groups, each including initial model hyperparameters and initial training hyperparameters. The determination module may be configured to select an initial hyperparameter group from the set and perform the following determination steps: selecting from the set of pre-trained models a pre-trained model matching the initial model hyperparameters in the selected initial hyperparameter group as the initial model; training the selected initial model according to the initial training hyperparameters to generate a quasi text classification model corresponding to the initial hyperparameter group; evaluating the generated quasi text classification model on a verification text set to generate an evaluation result; and, in response to determining that a generated evaluation result satisfies a hyperparameter-group determination condition, determining the text classification model from among the quasi text classification models whose evaluation results satisfy the condition. The update module may be configured to, in response to determining that the generated evaluation results do not satisfy the determination condition, update the initial hyperparameter groups in the set, select an initial hyperparameter group from the updated set, and continue the determination steps.
In some optional implementations of this embodiment, the statistical features may also include content features characterizing text content, and the determination module may be further configured to select from the set of pre-trained models a pre-trained model matching both the content features and the initial model hyperparameters in the selected initial hyperparameter group as the initial model, where each pre-trained model may correspond to a semantic label.
In some optional implementations of this embodiment, the determination module may further include a selection submodule (not shown) and an adjustment submodule (not shown). The selection submodule may be configured to choose training samples from the training sample set and perform the following training steps: inputting the sample texts of the chosen training samples into the selected initial model to generate text categories; determining a difference value from the generated text categories and the sample categories corresponding to the input sample texts; determining whether the difference value satisfies a training completion condition, where the difference value and the training completion condition are determined based on the initial training hyperparameters in the selected hyperparameter group; and, in response to determining that the training completion condition is satisfied, determining the selected initial model as the quasi text classification model corresponding to the selected hyperparameter group. The adjustment submodule may be configured to, in response to determining that the training completion condition is not satisfied, adjust the relevant parameters of the selected initial model, choose further training samples from the training sample set, use the adjusted initial model as the selected initial model, and continue the training steps.
In some optional implementations of this embodiment, the acquisition unit 501 may include a receiving module (not shown) and a third generation module (not shown). The receiving module may be configured to receive an annotated text set sent by a user terminal, where each annotated text may include a text and corresponding text-category annotation information. The third generation module may be configured to divide the annotated text set to generate the training sample set and a verification text set, where each training sample may include a text as the sample text and the text-category annotation information as the corresponding sample category.
In some optional implementations of this embodiment, the apparatus 500 for generating a classification model may include a receiving unit (not shown), a second generation unit (not shown), and a sending unit (not shown). The receiving unit may be configured to receive a set of texts to be classified sent by a user terminal. The second generation unit may be configured to input the set of texts to be classified into the text classification model and generate classification information corresponding to each text to be classified, where the classification information may characterize the category to which the text belongs and may match the sample categories. The sending unit may be configured to send the generated classification information, together with corresponding to-be-classified text information, to the user terminal, where the to-be-classified text information may be used to identify the text within the set of texts to be classified.
In the apparatus provided by the above embodiment of the disclosure, the acquisition unit 501 obtains a training sample set, where each training sample includes a sample text and a corresponding sample category. The determination unit 502 then determines the statistical features of the training sample set, the statistical features including form features characterizing text length. The first generation unit 503 then generates a text classification model based on the statistical features and on training of an initial model, where the text classification model characterizes the correspondence between text categories and texts, and the initial model is selected from a preset set of pre-trained models. Automatic generation of a text classification model is thus achieved without manual hyperparameter tuning.
Referring now to Fig. 6, a structural schematic diagram of an electronic device (for example, the server shown in Fig. 1) 600 suitable for implementing embodiments of the present disclosure is shown. The server shown in Fig. 6 is merely an example and should not impose any limitation on the functionality or scope of use of embodiments of the present disclosure.
As shown in Fig. 6, the electronic device 600 may include a processing unit (such as a central processing unit or a graphics processor) 601, which can execute various appropriate actions and processing according to a program stored in a read-only memory (ROM) 602 or a program loaded from a storage device 608 into a random access memory (RAM) 603. The RAM 603 also stores various programs and data needed for the operation of the electronic device 600. The processing unit 601, the ROM 602, and the RAM 603 are connected to one another through a bus 604. An input/output (I/O) interface 605 is also connected to the bus 604.
In general, the following devices may be connected to the I/O interface 605: an input device 606 including, for example, a touch screen, a touch pad, a keyboard, a mouse, a camera, a microphone, and the like; an output device 607 including, for example, a liquid crystal display (LCD), a loudspeaker, a vibrator, and the like; a storage device 608 including, for example, a magnetic tape, a hard disk, and the like; and a communication device 609. The communication device 609 may allow the electronic device 600 to communicate with other devices, wirelessly or by wire, to exchange data. Although Fig. 6 shows the electronic device 600 with various devices, it should be understood that it is not required to implement or provide all of the devices shown; more or fewer devices may alternatively be implemented or provided. Each box shown in Fig. 6 may represent one device or, as needed, multiple devices.
In particular, according to embodiments of the disclosure, the process described above with reference to the flowchart may be implemented as a computer software program. For example, an embodiment of the disclosure includes a computer program product comprising a computer program carried on a computer-readable medium, the computer program containing program code for executing the method shown in the flowchart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication device 609, installed from the storage device 608, or installed from the ROM 602. When the computer program is executed by the processing unit 601, the above-described functions defined in the methods of the embodiments of the disclosure are executed.
It should be noted that the computer-readable medium described in embodiments of the disclosure may be a computer-readable signal medium, a computer-readable storage medium, or any combination of the two. A computer-readable storage medium may be, for example, but is not limited to, an electric, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection with one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any appropriate combination of the above. In embodiments of the disclosure, a computer-readable storage medium may be any tangible medium containing or storing a program that can be used by, or in combination with, an instruction execution system, apparatus, or device. In embodiments of the disclosure, a computer-readable signal medium may include a data signal propagated in a baseband or as part of a carrier wave, in which computer-readable program code is carried. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any appropriate combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium, which can send, propagate, or transmit a program for use by, or in combination with, an instruction execution system, apparatus, or device. The program code contained on the computer-readable medium may be transmitted by any suitable medium, including but not limited to: an electric wire, an optical cable, RF (radio frequency), or any appropriate combination of the above.
The above computer-readable medium may be included in the above server, or it may exist alone without being assembled into the server. The above computer-readable medium carries one or more programs, and when the one or more programs are executed by the server, the server is caused to: obtain a training sample set, wherein a training sample includes a sample text and a sample class corresponding to the sample text; determine statistical features of the training sample set, wherein the statistical features include a form feature characterizing text length; and generate a text classification model based on the statistical features and training of an initial model, wherein the text classification model is used to characterize the correspondence between text categories and texts, and the initial model is chosen from a preset set of pre-trained models.
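A minimal end-to-end sketch of this stored-program behaviour follows, assuming scikit-learn estimators as the preset pre-trained pool and an invented length threshold as the rule for choosing the initial model; both are assumptions for illustration, not the disclosed design.

```python
# Obtain samples, determine the form feature, choose an initial model from a
# preset pool based on that feature, and train it into the classification model.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

def generate_text_classification_model(train_samples):
    texts = [t for t, _ in train_samples]
    labels = [c for _, c in train_samples]
    mean_len = sum(len(t) for t in texts) / len(texts)  # form feature (text length)
    # Choose an "initial model" from a preset set based on the form feature.
    if mean_len < 50:  # short texts: assumed to favor a bag-of-words NB model
        model = make_pipeline(TfidfVectorizer(), MultinomialNB())
    else:
        model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
    model.fit(texts, labels)  # the training step
    return model              # characterizes the category <-> text correspondence
```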
The computer program code for executing the operations of embodiments of the disclosure may be written in one or more programming languages or a combination thereof, including object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar languages. The program code may be executed entirely on the user's computer, partly on the user's computer, as an independent software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In cases involving a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the internet using an internet service provider).
The flowcharts and block diagrams in the accompanying drawings illustrate the possible architectures, functions, and operations of the systems, methods, and computer program products according to various embodiments of the disclosure. In this regard, each box in a flowchart or block diagram may represent a module, program segment, or part of code, which contains one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions marked in the boxes may occur in an order different from that indicated in the drawings. For example, two successively represented boxes may in fact be executed substantially in parallel, and they may sometimes be executed in the opposite order, depending on the functions involved. It should also be noted that each box in the block diagrams and/or flowcharts, and combinations of boxes in the block diagrams and/or flowcharts, may be implemented with a dedicated hardware-based system that executes the specified functions or operations, or may be implemented with a combination of dedicated hardware and computer instructions.
The units involved in embodiments of the disclosure may be implemented by means of software or by means of hardware. The described units may also be provided in a processor, which may, for example, be described as: a processor including an acquiring unit, a determination unit, and a first generation unit. The names of these units do not, under certain circumstances, constitute a limitation on the units themselves; for example, the acquiring unit may also be described as "a unit that obtains a training sample set, wherein a training sample includes a sample text and a sample class corresponding to the sample text".
The above description is only a preferred embodiment of the disclosure and an explanation of the applied technical principles. Those skilled in the art should appreciate that the scope of the invention involved in embodiments of the disclosure is not limited to technical solutions formed by the specific combination of the above technical features, and should also cover other technical solutions formed by any combination of the above technical features or their equivalent features without departing from the above inventive concept, for example, technical solutions formed by mutually replacing the above features with technical features having similar functions as disclosed in (but not limited to) embodiments of the disclosure.
Claims (16)
1. A method for generating a classification model, comprising:
obtaining a training sample set, wherein a training sample includes a sample text and a sample class corresponding to the sample text;
determining statistical features of the training sample set, wherein the statistical features include a form feature characterizing text length; and
generating a text classification model based on the statistical features and training of an initial model, wherein the text classification model is used to characterize the correspondence between text categories and texts, and the initial model is chosen from a preset set of pre-trained models.
2. The method according to claim 1, wherein the generating a text classification model based on the statistical features and training of an initial model comprises:
inputting the form feature into a pre-trained hyperparameter generation model to obtain a hyperparameter group corresponding to the form feature, wherein the hyperparameter group includes model hyperparameters and training hyperparameters;
choosing, from the set of pre-trained models, a pre-trained model matching the model hyperparameters in the hyperparameter group as the initial model; and
training the initial model according to the training hyperparameters in the hyperparameter group to generate the text classification model.
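By way of illustration only, the hyperparameter generation model recited in claim 2 could be realized as a regressor mapping the form feature to a hyperparameter group. The k-nearest-neighbours regressor and the logged feature-to-hyperparameter records below are assumptions for the sketch, not the disclosed design.

```python
# A regressor that maps the form feature (here: mean text length) to a
# hyperparameter group containing model and training hyperparameters, fitted
# on assumed historical (feature, best-hyperparameter) records.
import numpy as np
from sklearn.neighbors import KNeighborsRegressor

# Assumed records: [mean_length] -> [embedding_dim, learning_rate, epochs]
past_features = np.array([[20.0], [80.0], [300.0], [1200.0]])
past_hparams = np.array([[64, 0.01, 5], [128, 0.005, 8],
                         [256, 0.001, 10], [512, 0.001, 15]])

hparam_generator = KNeighborsRegressor(n_neighbors=2).fit(past_features, past_hparams)

def generate_hyperparameter_group(mean_text_length):
    emb, lr, epochs = hparam_generator.predict([[mean_text_length]])[0]
    return {"model": {"embedding_dim": int(emb)},        # model hyperparameters
            "training": {"learning_rate": float(lr),     # training hyperparameters
                         "epochs": int(epochs)}}
```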
3. The method according to claim 1, wherein the generating a text classification model based on the statistical features and training of an initial model comprises:
obtaining a set of initial hyperparameter groups, wherein an initial hyperparameter group includes initial model hyperparameters and initial training hyperparameters;
choosing an initial hyperparameter group from the set of initial hyperparameter groups, and executing the following determining step: choosing, from the set of pre-trained models, a pre-trained model matching the initial model hyperparameters in the chosen initial hyperparameter group as the initial model; training the chosen initial model according to the initial training hyperparameters in the chosen initial hyperparameter group to generate a candidate text classification model corresponding to the initial hyperparameter group; evaluating the generated candidate text classification model based on a validation text set to generate an evaluation result; and in response to determining that a generated evaluation result satisfies a hyperparameter group determination condition, determining the text classification model from the candidate text classification models corresponding to the evaluation results satisfying the hyperparameter group determination condition; and
in response to determining that the generated evaluation results do not satisfy the hyperparameter group determination condition, updating the initial hyperparameter groups in the set of initial hyperparameter groups, choosing an initial hyperparameter group from the updated set of initial hyperparameter groups, and continuing to execute the determining step.
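A hedged sketch of this determining step as a plain search loop follows: each initial hyperparameter group selects and trains a candidate model, the candidate is evaluated on the validation set, and the groups are perturbed when no candidate passes. The accuracy bar of 0.9, the doubling/halving update rule, and the particular hyperparameters are assumptions made for the example.

```python
import random
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

def search_hyperparameter_groups(train, val, groups, accuracy_bar=0.9, rounds=5):
    # train, val: lists of (text, category) pairs; groups: initial hyperparameter groups
    X_tr, y_tr = [t for t, _ in train], [c for _, c in train]
    X_va, y_va = [t for t, _ in val], [c for _, c in val]
    for _ in range(rounds):
        for g in groups:
            candidate = make_pipeline(
                TfidfVectorizer(max_features=g["max_features"]),  # model hyperparameter
                LogisticRegression(C=g["C"], max_iter=1000),      # training hyperparameter
            ).fit(X_tr, y_tr)                # candidate text classification model
            score = candidate.score(X_va, y_va)  # evaluation result on validation set
            if score >= accuracy_bar:            # hyperparameter group determination condition
                return candidate                 # becomes the text classification model
        # No group satisfied the condition: update the group set, continue the step.
        groups = [{"max_features": g["max_features"] * 2,
                   "C": g["C"] * random.choice([0.5, 2.0])} for g in groups]
    return None
```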
4. The method according to claim 3, wherein the statistical features further include a content feature characterizing text content; and
the choosing, from the set of pre-trained models, a pre-trained model matching the initial model hyperparameters in the chosen initial hyperparameter group as the initial model comprises:
choosing, from the set of pre-trained models, a pre-trained model matching the content feature and the initial model hyperparameters in the chosen initial hyperparameter group as the initial model, wherein a pre-trained model corresponds to a semantic label.
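As a hedged sketch of the content-feature matching in claim 4: each pre-trained model in the pool carries a semantic label describing its training corpus, and the model whose label best overlaps the content feature of the samples is chosen. The pool contents and the keyword-overlap scoring are assumptions for this example, not the disclosed matching rule.

```python
# Choosing a pre-trained model by overlap between the content feature (salient
# terms of the sample texts) and each model's semantic label.
PRETRAINED_POOL = {
    "news": {"semantic_label": {"politics", "economy", "sports"}},
    "reviews": {"semantic_label": {"product", "price", "quality"}},
    "medical": {"semantic_label": {"patient", "symptom", "treatment"}},
}

def choose_by_content_feature(content_keywords):
    # content_keywords: a set of salient terms extracted from the sample texts
    name = max(PRETRAINED_POOL,
               key=lambda n: len(content_keywords & PRETRAINED_POOL[n]["semantic_label"]))
    return name  # identifies the pre-trained model to use as the initial model
```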
5. The method according to claim 3, wherein the training the chosen initial model according to the initial training hyperparameters in the chosen initial hyperparameter group to generate a candidate text classification model corresponding to the initial hyperparameter group comprises:
choosing training samples from the training sample set, and executing the following training step: inputting the sample texts of the chosen training samples into the chosen initial model to generate text categories; determining a difference value according to the generated text categories and the sample classes corresponding to the input sample texts; determining whether the difference value satisfies a training completion condition, wherein the difference value and the training completion condition are determined based on the initial training hyperparameters in the chosen hyperparameter group; and in response to determining that the training completion condition is satisfied, determining the chosen initial model as the candidate text classification model corresponding to the chosen hyperparameter group; and
in response to determining that the training completion condition is not satisfied, adjusting the relevant parameters of the chosen initial model, choosing training samples from the training sample set, using the adjusted initial model as the chosen initial model, and continuing to execute the training step.
6. The method according to claim 3, wherein the obtaining a training sample set comprises:
receiving a labeled text set sent by a user terminal, wherein a labeled text includes a text and a text category annotation corresponding to the text; and
splitting the labeled text set to generate the training sample set and the validation text set, wherein a training sample includes the text as a sample text and the text category annotation as the sample class corresponding to the sample text.
7. The method according to any one of claims 1-6, wherein the method further comprises:
receiving a set of texts to be classified sent by a user terminal;
inputting the set of texts to be classified into the text classification model to generate the category information corresponding to each text to be classified in the set, wherein the category information is used to characterize the category to which a text to be classified belongs, and the category information matches a sample class; and
sending the generated category information, together with the corresponding to-be-classified text information, to the user terminal, wherein the to-be-classified text information is used to identify the text to be classified in the set of texts to be classified.
8. An apparatus for generating a classification model, comprising:
an acquiring unit configured to obtain a training sample set, wherein a training sample includes a sample text and a sample class corresponding to the sample text;
a determination unit configured to determine statistical features of the training sample set, wherein the statistical features include a form feature characterizing text length; and
a first generation unit configured to generate a text classification model based on the statistical features and training of an initial model, wherein the text classification model is used to characterize the correspondence between text categories and texts, and the initial model is chosen from a preset set of pre-trained models.
9. The apparatus according to claim 8, wherein the first generation unit includes:
a first generation module configured to input the form feature into a pre-trained hyperparameter generation model to obtain a hyperparameter group corresponding to the form feature, wherein the hyperparameter group includes model hyperparameters and training hyperparameters;
a choosing module configured to choose, from the set of pre-trained models, a pre-trained model matching the model hyperparameters in the hyperparameter group as the initial model; and
a second generation module configured to train the initial model according to the training hyperparameters in the hyperparameter group to generate the text classification model.
10. The apparatus according to claim 8, wherein the first generation unit includes:
an obtaining module configured to obtain a set of initial hyperparameter groups, wherein an initial hyperparameter group includes initial model hyperparameters and initial training hyperparameters;
a determining module configured to choose an initial hyperparameter group from the set of initial hyperparameter groups and execute the following determining step: choosing, from the set of pre-trained models, a pre-trained model matching the initial model hyperparameters in the chosen initial hyperparameter group as the initial model; training the chosen initial model according to the initial training hyperparameters in the chosen initial hyperparameter group to generate a candidate text classification model corresponding to the initial hyperparameter group; evaluating the generated candidate text classification model based on a validation text set to generate an evaluation result; and in response to determining that a generated evaluation result satisfies a hyperparameter group determination condition, determining the text classification model from the candidate text classification models corresponding to the evaluation results satisfying the hyperparameter group determination condition; and
an update module configured to, in response to determining that the generated evaluation results do not satisfy the hyperparameter group determination condition, update the initial hyperparameter groups in the set of initial hyperparameter groups, choose an initial hyperparameter group from the updated set of initial hyperparameter groups, and continue to execute the determining step.
11. The apparatus according to claim 10, wherein the statistical features further include a content feature characterizing text content; and the determining module is further configured to:
choose, from the set of pre-trained models, a pre-trained model matching the content feature and the initial model hyperparameters in the chosen initial hyperparameter group as the initial model, wherein a pre-trained model corresponds to a semantic label.
12. The apparatus according to claim 10, wherein the determining module further comprises:
a choosing submodule configured to choose training samples from the training sample set and execute the following training step: inputting the sample texts of the chosen training samples into the chosen initial model to generate text categories; determining a difference value according to the generated text categories and the sample classes corresponding to the input sample texts; determining whether the difference value satisfies a training completion condition, wherein the difference value and the training completion condition are determined based on the initial training hyperparameters in the chosen hyperparameter group; and in response to determining that the training completion condition is satisfied, determining the chosen initial model as the candidate text classification model corresponding to the chosen hyperparameter group; and
an adjusting submodule configured to, in response to determining that the training completion condition is not satisfied, adjust the relevant parameters of the chosen initial model, choose training samples from the training sample set, use the adjusted initial model as the chosen initial model, and continue to execute the training step.
13. The apparatus according to claim 10, wherein the acquiring unit includes:
a receiving module configured to receive a labeled text set sent by a user terminal, wherein a labeled text includes a text and a text category annotation corresponding to the text; and
a third generation module configured to split the labeled text set to generate the training sample set and the validation text set, wherein a training sample includes the text as a sample text and the text category annotation as the sample class corresponding to the sample text.
14. The apparatus according to any one of claims 8-13, wherein the apparatus further includes:
a receiving unit configured to receive a set of texts to be classified sent by a user terminal;
a second generation unit configured to input the set of texts to be classified into the text classification model to generate the category information corresponding to each text to be classified in the set, wherein the category information is used to characterize the category to which a text to be classified belongs, and the category information matches a sample class; and
a sending unit configured to send the generated category information, together with the corresponding to-be-classified text information, to the user terminal, wherein the to-be-classified text information is used to identify the text to be classified in the set of texts to be classified.
15. A server, comprising:
one or more processors; and
a storage device on which one or more programs are stored;
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method according to any one of claims 1-7.
16. A computer-readable medium on which a computer program is stored, wherein the program, when executed by a processor, implements the method according to any one of claims 1-7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910721353.3A CN110457476A (en) | 2019-08-06 | 2019-08-06 | Method and apparatus for generating disaggregated model |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110457476A true CN110457476A (en) | 2019-11-15 |
Family
ID=68485051
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910721353.3A Pending CN110457476A (en) | 2019-08-06 | 2019-08-06 | Method and apparatus for generating disaggregated model |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110457476A (en) |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111241280A (en) * | 2020-01-07 | 2020-06-05 | 支付宝(杭州)信息技术有限公司 | Training method of text classification model and text classification method |
CN111241280B (en) * | 2020-01-07 | 2023-09-05 | 支付宝(杭州)信息技术有限公司 | Training method of text classification model and text classification method |
CN111563163A (en) * | 2020-04-29 | 2020-08-21 | 厦门市美亚柏科信息股份有限公司 | Text classification model generation method and device and data standardization method and device |
CN111696517A (en) * | 2020-05-28 | 2020-09-22 | 平安科技(深圳)有限公司 | Speech synthesis method, speech synthesis device, computer equipment and computer readable storage medium |
CN113761181A (en) * | 2020-06-15 | 2021-12-07 | 北京京东振世信息技术有限公司 | Text classification method and device |
WO2023109828A1 (en) * | 2021-12-15 | 2023-06-22 | 维沃移动通信有限公司 | Data collection method and apparatus, and first device and second device |
US11748597B1 (en) * | 2022-03-17 | 2023-09-05 | Sas Institute, Inc. | Computerized engines and graphical user interfaces for customizing and validating forecasting models |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||