CN108984692B - Webpage processing method and apparatus, storage medium, and electronic device

Publication number: CN108984692B
Application number: CN201810725034.5A
Authority: CN (China)
Other versions: CN108984692A
Other languages: Chinese (zh)
Inventors: 张峰, 聂颖, 郑权
Current assignee: Dragon Horse Zhixin (Zhuhai Hengqin) Technology Co Ltd
Original assignee: Dragon Horse Zhixin (Zhuhai Hengqin) Technology Co Ltd
Legal status: Active (as listed; an assumption, not a legal conclusion)
Prior art keywords: value, population, individual, webpage, language
Landscapes: Machine Translation (AREA)

Abstract

The present invention provides a webpage processing method and apparatus, a storage medium, and an electronic device. The method comprises: obtaining text attribute values of webpages in a training sample that contain a first language; using the first parameter value as the input variable of a perceptron neural network to determine a third parameter value indicating whether the webpage is body text predominantly in the first language; determining the fitness value of each population individual for the perceptron neural network according to the second parameter value and the third parameter value; decoding the individual with the best fitness value in the population to obtain the connection weights and bias values of the perceptron neural network; and determining, based on the connection weights and bias values, whether a webpage to be processed is body text predominantly in the first language. The invention solves the problem in the related art that the parameters used to extract webpage content are set in advance from experience and the characteristics of the webpage structure, so that improper parameter settings lead to inaccurate extraction of the webpage body text, and it thereby improves the user experience.

Description

Webpage processing method and apparatus, storage medium, and electronic device
Technical field
The present invention relates to the field of communications, and in particular to a webpage processing method and apparatus, a storage medium, and an electronic device.
Background
In the webpage body-text extraction schemes provided by the prior art, after a webpage is loaded in the browser its content is segmented and then located using a matching-rule file in the browser. The required field content is extracted and displayed, so that the user sees the webpage after the body text has been filtered out and can read conveniently and with focus.
However, existing schemes for extracting webpage body text have at least the following defect: the parameters usually have to be set according to the experience of researchers and the characteristics of the webpage structure. These methods place high demands on the parameter settings; if the parameters are set improperly, the body text is extracted inaccurately.
No effective solution to the above problem in the related art has yet been proposed.
Summary of the invention
Embodiments of the present invention provide a webpage processing method and apparatus, a storage medium, and an electronic device, to at least solve the problem in the related art that the parameters used to extract webpage content are set in advance from experience and the characteristics of the webpage structure, so that improper parameter settings lead to inaccurate extraction of the webpage body text.
According to one embodiment of the present invention, a webpage processing method is provided, comprising: obtaining text attribute values of webpages in a training sample that contain a first language, wherein the text attribute values include a first parameter value corresponding to the first language in the webpage and a second parameter value indicating whether the webpage is body text predominantly in the first language; using the first parameter value as the input variable of a perceptron neural network to determine a third parameter value indicating whether the webpage is body text predominantly in the first language; determining the fitness value of each population individual for the perceptron neural network according to the second parameter value and the third parameter value; decoding the individual with the best fitness value in the population to obtain the connection weights and bias values of the perceptron neural network; and determining, based on the connection weights and bias values, whether a webpage to be processed is body text predominantly in the first language.
According to another embodiment of the present invention, a webpage processing apparatus is provided, comprising: a first obtaining module, configured to obtain text attribute values of webpages in a training sample that contain a first language, wherein the text attribute values include a first parameter value corresponding to the first language in the webpage and a second parameter value indicating whether the webpage is body text predominantly in the first language; a first determining module, configured to use the first parameter value as the input variable of a perceptron neural network to determine a third parameter value indicating whether the webpage is body text predominantly in the first language; a second determining module, configured to determine the fitness value of each population individual for the perceptron neural network according to the second parameter value and the third parameter value; a decoding module, configured to decode the individual with the best fitness value in the population to obtain the connection weights and bias values of the perceptron neural network; and a third determining module, configured to determine, based on the connection weights and bias values, whether a webpage to be processed is body text predominantly in the first language.
According to still another embodiment of the present invention, a storage medium is further provided. A computer program is stored in the storage medium, and the computer program is configured to execute, when run, the steps in any of the above method embodiments.
According to still another embodiment of the present invention, an electronic device is further provided, including a memory and a processor. A computer program is stored in the memory, and the processor is configured to run the computer program to execute the steps in any of the above method embodiments.
With the present invention, the text attribute values of webpages in the training sample that contain the first language are obtained, the fitness values of the population individuals for the perceptron neural network are determined from them, and the connection weights and bias values of the perceptron neural network are then determined. When a webpage text is to be processed, the connection weights and bias values of the perceptron neural network can therefore determine whether that text is body text predominantly in the first language. The webpage body text is thus determined by a trained perceptron neural network rather than by parameters set in advance. This solves the problem in the related art that the parameters used to extract webpage content are set in advance from experience and the characteristics of the webpage structure, so that improper parameter settings lead to inaccurate extraction of the webpage body text, and it improves the user experience.
Brief description of the drawings
The drawings described here provide a further understanding of the present invention and form part of this application. The illustrative embodiments of the present invention and their descriptions are used to explain the present invention and do not unduly limit it. In the drawings:
Fig. 1 is a hardware block diagram of a terminal for the webpage processing method of an embodiment of the present invention;
Fig. 2 is a flowchart of the webpage processing method according to an embodiment of the present invention;
Fig. 3 is a structural block diagram of the webpage processing apparatus according to an embodiment of the present invention.
Detailed description of the embodiments
The present invention is described in detail below with reference to the drawings and in combination with the embodiments. It should be noted that, where no conflict arises, the embodiments of the present application and the features in the embodiments may be combined with each other.
It should be noted that the terms "first", "second", and so on in the description, the claims, and the above drawings are used to distinguish similar objects and do not describe a particular order or sequence.
Embodiment 1
The method embodiment provided in Embodiment 1 of the present application may be executed in a mobile terminal, a computer terminal, or a similar computing device. Taking execution on a mobile terminal as an example, Fig. 1 is a hardware block diagram of a terminal for the webpage processing method of an embodiment of the present invention. As shown in Fig. 1, the mobile terminal 10 may include one or more processors 102 (only one is shown in Fig. 1; the processor 102 may include, but is not limited to, a processing device such as a microprocessor (MCU) or a programmable logic device (FPGA)) and a memory 104 for storing data. Optionally, the mobile terminal may further include a transmission device 106 for communication functions and an input/output device 108. A person of ordinary skill in the art will understand that the structure shown in Fig. 1 is only illustrative and does not limit the structure of the mobile terminal. For example, the mobile terminal 10 may include more or fewer components than shown in Fig. 1, or have a configuration different from that shown in Fig. 1.
The memory 104 may be used to store computer programs, for example the software programs and modules of application software, such as the computer program corresponding to the webpage processing method in the embodiments of the present invention. By running the computer programs stored in the memory 104, the processor 102 executes various functional applications and data processing, that is, implements the above method. The memory 104 may include high-speed random access memory and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 104 may further include memory located remotely from the processor 102, and such remote memory may be connected to the mobile terminal 10 through a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The transmission device 106 is used to receive or send data via a network. A specific example of such a network is the wireless network provided by the communication provider of the mobile terminal 10. In one example, the transmission device 106 includes a network adapter (Network Interface Controller, NIC), which can connect to other network devices through a base station and thereby communicate with the Internet. In another example, the transmission device 106 may be a radio frequency (Radio Frequency, RF) module, which communicates with the Internet wirelessly.
This embodiment provides a webpage processing method. Fig. 2 is a flowchart of the webpage processing method according to an embodiment of the present invention. As shown in Fig. 2, the flow includes the following steps:
Step S202: obtain text attribute values of webpages in a training sample that contain a first language, wherein the text attribute values include a first parameter value corresponding to the first language in the webpage and a second parameter value indicating whether the webpage is body text predominantly in the first language;
Step S204: use the first parameter value as the input variable of a perceptron neural network to determine a third parameter value indicating whether the webpage is body text predominantly in the first language;
Step S206: determine the fitness value of each population individual for the perceptron neural network according to the second parameter value and the third parameter value;
Step S208: decode the individual with the best fitness value in the population to obtain the connection weights and bias values of the perceptron neural network;
Step S210: determine, based on the connection weights and bias values, whether a webpage to be processed is body text predominantly in the first language.
Through steps S202 to S210 above, the text attribute values of webpages in the training sample that contain the first language are obtained, the fitness values of the population individuals for the perceptron neural network are determined from them, and the connection weights and bias values of the perceptron neural network are then determined. When a webpage text is to be processed (a new webpage text), the connection weights and bias values of the perceptron neural network can therefore determine whether the new text is body text predominantly in the first language. The webpage body text is thus determined by a trained perceptron neural network rather than by parameters set in advance. This solves the problem in the related art that the parameters used to extract webpage content are set in advance from experience and the characteristics of the webpage structure, so that improper parameter settings lead to inaccurate extraction of the webpage body text, and it improves the user experience.
It should be noted that the first language in this embodiment may be Chinese, Korean, Japanese, and so on, and may be configured according to the user's needs.
In an optional implementation of this embodiment, the text attribute values of webpages in the training sample that contain the first language (step S202) may be obtained as follows:
Step S202-1: obtain the ratio of the first language in the webpage containing the first language, the number of characters of the first language, and the total number of characters of the webpage containing the first language;
Step S202-2: determine the mean of the ratio, the variance of the ratio, the mean of the character count, and the variance of the character count from the ratio and the character count;
Step S202-3: take the ratio of the first language, the number of characters of the first language, the total number of characters of the webpage containing the first language, the mean of the ratio, the variance of the ratio, the mean of the character count, and the variance of the character count as the first parameter value;
Step S202-4: determine the second parameter value based on the first parameter value.
Steps S202-1 to S202-4 are illustrated below, taking Chinese as the first language. In an optional implementation of this embodiment, steps S202-1 to S202-4 may include:
Step 1: for a number of webpages, extract from each webpage, according to the HTML structure, every tag that contains Chinese content (usually a div tag) and put it into the tag information list labellist = {L(1), L(2), ..., L(i), ..., L(num)}, where num is the number of tags and L(i) = {L(i, j)}, j = 1, 2, is the tag information: L(i, 1) stores the tag content and L(i, 2) stores a status word indicating whether the tag is main body text.
Step 2: calculate the Chinese ratio inta and the number of Chinese characters present (Chinese number) in each tag;
Step 3: according to the character encoding of Chinese characters, count the number of Chinese characters CN(ki) in L(ki) and the number of text characters AN(ki) in the whole tag content;
From these, calculate the Chinese ratio inta(ki) of L(ki) with the formula:
inta(ki) = CN(ki) / AN(ki)
Step 4: from the Chinese ratio inta(ki) and the number of Chinese characters CN(ki), calculate the text attribute value power(ki);
The calculation first normalizes inta and CN, with the following formulas:
Norinta(ki) = (inta(ki) - intamean) / stdinta
NorCN(ki) = (CN(ki) - CNmean) / stdCN
power(ki) = Norinta(ki) * NorCN(ki)
where intamean is the mean of inta, stdinta is the variance of inta, CNmean is the mean of CN, and stdCN is the variance of CN;
Step 5: obtain, for each tag, the information vector = {intamean, stdinta, CNmean, stdCN, AN(ki), CN(ki), power(ki), L(ki, 2)}, eight parameters in total.
Here intamean is the mean of the Chinese ratio over all tags, stdinta is the variance of the Chinese ratio over all tags, CNmean is the mean number of Chinese characters over all tags, stdCN is the variance of the number of Chinese characters over all tags, CN(ki) is the number of Chinese characters, AN(ki) is the number of text characters in the whole tag content, and L(ki, 2) stores the status word indicating whether the tag is main body text.
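The feature calculation in steps 2 to 5 can be sketched in Python as follows. This is a minimal illustration, not the patented implementation. Two points are assumptions: Chinese characters are detected as the CJK Unified Ideographs range U+4E00 to U+9FFF, and stdinta and stdCN are computed as population standard deviations (suggested by the `std` prefix) although the text calls them variances. The function names are hypothetical.

```python
import statistics

def is_chinese(ch):
    # Assumption: "Chinese character" means the CJK Unified Ideographs block.
    return "\u4e00" <= ch <= "\u9fff"

def tag_features(tag_texts):
    """Return one row per tag: [intamean, stdinta, CNmean, stdCN, AN, CN, power]."""
    cn = [sum(is_chinese(c) for c in text) for text in tag_texts]   # CN(ki)
    an = [len(text) for text in tag_texts]                          # AN(ki)
    inta = [c / a if a else 0.0 for c, a in zip(cn, an)]            # inta = CN / AN
    intamean, cnmean = statistics.mean(inta), statistics.mean(cn)
    # Assumption: the std* values are population standard deviations.
    stdinta, stdcn = statistics.pstdev(inta), statistics.pstdev(cn)
    rows = []
    for i in range(len(tag_texts)):
        # Guard against a zero spread, which the source text does not discuss.
        norinta = (inta[i] - intamean) / stdinta if stdinta else 0.0
        norcn = (cn[i] - cnmean) / stdcn if stdcn else 0.0
        rows.append([intamean, stdinta, cnmean, stdcn, an[i], cn[i], norinta * norcn])
    return rows

rows = tag_features(["中文中文", "abcd"])
```

Each row holds the seven input variables; the eighth parameter, L(ki, 2), is the label and has to come from annotated training data.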
In another optional implementation of this embodiment, the population used in steps S202 to S210 to determine the connection weights and bias values of the perceptron neural network may be generated by the following method, whose steps include:
Step 10: randomly generate a first population P_t containing Popsize individuals, wherein each individual in the first population stores DIM design parameters to be optimized;
wherein the subscript i = 1, 2, ..., Popsize identifies the i-th individual of population P_t (the defining expressions for P_t and its individuals are not reproduced in the source text);
wherein the random initialization formula is: (formula not reproduced in the source text)
wherein the subscript j = 1, ..., DIM, and rand(0, 1) is a function that generates a random real number uniformly distributed on [0, 1].
Step 11: set the value of the first counter ki to N;
Step 12: randomly generate a selection factor ches; if ches is greater than the preset opposition-based learning factor OBL, execute step 13; otherwise execute step 17;
Step 13: if the value N of the first counter ki is greater than the number of individuals Popsize in the first population, the first population is the population used to determine the connection weights and bias values of the perceptron neural network; that is, the fitness values of the individuals in the first population are determined and the current evaluation count is set to the sum of the previous evaluation count and Popsize. Otherwise execute step 14;
Step 14: based on the first population, the preset crossover rate Cr, and the preset scaling factor F, obtain the trial individual corresponding to the individual in the first population;
Step 15: compute the fitness value of the trial individual and, according to a preset rule, select between the individual of the first population and the trial individual the one that enters the second population;
wherein the preset rule is: (formula not reproduced in the source text)
Step 16: set the value of the first counter ki to N plus 1, and execute step 12;
Step 17: determine the minimum value, the maximum value, and the mean value of the first population on a specified dimension j;
wherein the minimum value, maximum value, and mean value of the first population on the specified dimension j are determined by the following formula: (formula not reproduced in the source text)
Step 18: based on the minimum value, the maximum value, and the mean value, obtain the third population BP_t;
wherein the third population is obtained from the minimum value, the maximum value, and the mean value in the following manner: (formula not reproduced in the source text)
Step 19: determine the fitness values of the individuals in the third population and, according to the fitness values of the individuals in the first population and in the third population, select the Popsize best individuals from the first and third populations to replace all individuals in the first population;
Step 20: repeat steps 10 to 20 until the evaluation count reaches the preset value; once it does, the first population as replaced in step 19 is the population used to determine the connection weights and bias values of the perceptron neural network.
Obtaining the trial individual corresponding to an individual in the first population, based on the first population, the preset crossover rate Cr, and the preset scaling factor F, comprises:
Step 20: set the value of the second counter mj to M;
Step 21: randomly obtain a positive integer jRand between 1 and DIM, and randomly obtain two unequal positive integers RI3 and RI4 between 1 and Popsize;
Step 23: if the value of the second counter mj is greater than DIM, execute step 15; otherwise execute step 24;
Step 24: randomly obtain a random number R5 between 0 and 1; if R5 is less than the preset crossover rate Cr of the individual in the first population, or jRand is equal to the value of the second counter mj, execute step 25; otherwise execute step 26;
Step 25: (formula not reproduced in the source text); then execute step 27;
Step 26: (formula not reproduced in the source text);
Step 27: increase the value of the second counter mj by 1, and go to step 23.
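The overall loop of steps 10 to 27 can be sketched as below. Because the source text does not reproduce its formulas, several parts are assumptions rather than the patent's definitions: the initialization bounds `low` and `high`, the mutation `best + F * (x_RI3 - x_RI4)`, the greedy parent-versus-trial selection, and the opposite point `min + max - x` (which does not use the mean mentioned in step 17). The sketch also draws the selection factor once per generation rather than once per individual, and treats larger fitness as better. The name `obl_de` is hypothetical.

```python
import random

def obl_de(fitness, dim, popsize=20, max_fes=2000, cr=0.9, f=0.5, obl=0.3,
           low=-1.0, high=1.0, seed=0):
    """Maximise fitness(x) over lists of `dim` floats; returns (best, best_fitness)."""
    rng = random.Random(seed)
    # Step 10: random initial population (the bounds low/high are assumed).
    pop = [[low + rng.random() * (high - low) for _ in range(dim)]
           for _ in range(popsize)]
    fit = [fitness(x) for x in pop]
    fes = popsize                                   # evaluations spent so far
    while fes < max_fes:
        if rng.random() > obl:                      # step 12: selection factor vs. OBL
            best = pop[max(range(popsize), key=fit.__getitem__)]
            next_pop, next_fit = [], []
            for i in range(popsize):
                j_rand = rng.randrange(dim)
                r3, r4 = rng.sample(range(popsize), 2)   # two unequal indices
                trial = [
                    best[j] + f * (pop[r3][j] - pop[r4][j])   # assumed mutation
                    if (rng.random() < cr or j == j_rand) else pop[i][j]
                    for j in range(dim)
                ]
                trial_fit = fitness(trial)
                # Assumed rule: the fitter of parent and trial survives.
                if trial_fit >= fit[i]:
                    next_pop.append(trial)
                    next_fit.append(trial_fit)
                else:
                    next_pop.append(pop[i])
                    next_fit.append(fit[i])
            pop, fit = next_pop, next_fit
        else:
            # Steps 17 to 19: opposition-based learning on per-dimension bounds.
            lo = [min(x[j] for x in pop) for j in range(dim)]
            hi = [max(x[j] for x in pop) for j in range(dim)]
            opp = [[lo[j] + hi[j] - x[j] for j in range(dim)] for x in pop]  # assumed
            opp_fit = [fitness(x) for x in opp]
            ranked = sorted(zip(fit + opp_fit, pop + opp),
                            key=lambda pair: pair[0], reverse=True)[:popsize]
            fit = [pair[0] for pair in ranked]
            pop = [pair[1] for pair in ranked]
        fes += popsize
    winner = max(range(popsize), key=fit.__getitem__)
    return pop[winner], fit[winner]

sphere = lambda x: -sum(v * v for v in x)   # toy fitness, largest at the origin
best, best_fit = obl_de(sphere, dim=3)
```

In the method described here, `fitness` would score an individual by decoding it into perceptron weights and evaluating it on the training set; the toy `sphere` function only exercises the search loop.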
In addition, the fitness value in this embodiment may be determined as follows:
In step S204, using the first parameter value as the input variable of the perceptron neural network to determine the third parameter value indicating whether the webpage is body text predominantly in the first language comprises:
Step 31: set the third counter Tm = 1;
Step 32: use the first parameter value as the input variable of the perceptron neural network to obtain the third parameter value outL indicating whether the webpage is body text predominantly in the first language;
In step S206, determining the fitness value of each population individual for the perceptron neural network according to the second parameter value and the third parameter value comprises:
Step 33: calculate the squared error SN_Tm between the second parameter value L and the third parameter value outL as follows:
SN_Tm = (outL - L)^2
Step 34: set the third counter Tm = Tm + 1;
Step 35: if the value of the third counter is greater than the number of data groups in the training sample used for training, execute step 36; otherwise execute step 32;
Step 36: determine the sum of squared errors of the perceptron neural network on the training sample data set (expression not reproduced in the source text), and determine the fitness value of the individual in the following manner: (formula not reproduced in the source text)
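Steps 31 to 36 amount to accumulating SN_Tm over the training groups. A minimal sketch follows, assuming `predict` is any callable that returns outL for one feature vector. The mapping from this sum to the fitness value is not reproduced in the source text, so the sketch stops at the sum. The function name is hypothetical.

```python
def sum_squared_error(predict, samples):
    """samples: iterable of (features, label L); predict(features) returns outL."""
    total = 0.0
    for features, label in samples:        # Tm = 1 .. number of training groups
        out_l = predict(features)
        total += (out_l - label) ** 2      # SN_Tm = (outL - L)^2
    return total

# A constant predictor of 0.5 is off by 0.5 on both labels.
sse = sum_squared_error(lambda features: 0.5, [([0.0], 1), ([0.0], 0)])
```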
The method steps of this embodiment are described in detail below with reference to a specific implementation, which provides a Chinese webpage body-text extraction method based on opposition-based learning differential evolution. The steps of the method include:
Step S302: perform text training with the 7 parameters of a tag as input and one parameter as output. It should be noted that 7 parameters are preferred in this embodiment; another number may be set as needed.
Step S304: obtain a neural network model from the text training result;
Step S306: for a new text, determine the text type it belongs to according to the trained neural network model.
First, the text training in step S302 proceeds as follows:
Step S302-1: for a number of webpages, extract from each webpage, according to the HTML structure, every tag that contains Chinese content (usually a div tag) and put it into the tag information list labellist = {L(1), L(2), ..., L(i), ..., L(num)}, where num is the number of tags and L(i) = {L(i, j)}, j = 1, 2, is the tag information: L(i, 1) stores the tag content and L(i, 2) stores a status word indicating whether the tag is main body text.
Step S302-2: calculate the Chinese ratio inta and the number of Chinese characters present (Chinese number) in each tag;
Step S302-3: according to the character encoding of Chinese characters, count the number of Chinese characters CN(ki) in L(ki) and the number of text characters AN(ki) in the whole tag content;
From these, calculate the Chinese ratio inta(ki) of L(ki) with the formula:
inta(ki) = CN(ki) / AN(ki)
Step S302-4: from the Chinese ratio inta(ki) and the number of Chinese characters CN(ki), calculate the text attribute value power(ki);
The calculation first normalizes inta and CN, with the following formulas:
Norinta(ki) = (inta(ki) - intamean) / stdinta
NorCN(ki) = (CN(ki) - CNmean) / stdCN
power(ki) = Norinta(ki) * NorCN(ki)
where intamean is the mean of inta, stdinta is the variance of inta, CNmean is the mean of CN, and stdCN is the variance of CN.
Step S302-5: obtain, for each tag, the information vector = {intamean, stdinta, CNmean, stdCN, AN(ki), CN(ki), power(ki), L(ki, 2)}, eight parameters in total.
Here intamean is the mean of the Chinese ratio over all tags, stdinta is the variance of the Chinese ratio over all tags, CNmean is the mean number of Chinese characters over all tags, stdCN is the variance of the number of Chinese characters over all tags, CN(ki) is the number of Chinese characters, AN(ki) is the number of text characters in the whole tag content, and L(ki, 2) stores the status word indicating whether the tag is main body text.
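Step S302-1, collecting the div tags that contain Chinese content, can be sketched with Python's standard `html.parser` module. The choices below are illustrative assumptions, not taken from the patent: text is attributed to the innermost open div, Chinese is detected as the range U+4E00 to U+9FFF, and the status word L(i, 2) is left as `None` because it has to be supplied by labelling. The class and function names are hypothetical.

```python
from html.parser import HTMLParser

class DivCollector(HTMLParser):
    """Collect the text of every <div> that contains at least one Chinese character."""

    def __init__(self):
        super().__init__()
        self._open = []        # one text buffer per currently open <div>
        self.labellist = []    # entries are [content, status]; status is unlabelled

    def handle_starttag(self, tag, attrs):
        if tag == "div":
            self._open.append([])

    def handle_data(self, data):
        if self._open:
            self._open[-1].append(data)    # text goes to the innermost open <div>

    def handle_endtag(self, tag):
        if tag == "div" and self._open:
            content = "".join(self._open.pop()).strip()
            if any("\u4e00" <= ch <= "\u9fff" for ch in content):
                self.labellist.append([content, None])

def extract_labellist(html):
    parser = DivCollector()
    parser.feed(html)
    parser.close()
    return parser.labellist

page = "<div>导航</div><div>menu</div><div>这是正文内容。<div>ad</div></div>"
labellist = extract_labellist(page)
```

Tags are appended in the order their closing tags appear, so a parent div is listed after its nested children.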
The neural network model in step S304 is trained as follows:
Step S304-1: extract the training samples; the first 80% are used as the training data set of the neural network, with a data volume of TraNum groups, and the last 20% are used as the test data set, with a data volume of TestNum groups;
Step S304-2: the user initializes the parameters: the population size Popsize, the maximum evaluation count MAX_FEs, the number HN of hidden-layer neurons of the perceptron neural network, and the opposition-based learning factor OBL;
Step S304-3: set the current generation t = 0, the crossover rate Cr, and the scaling factor F, where the subscript i = 1, 2, ..., Popsize, and set the current evaluation count FEs = 0;
Step S304-4: let the input variables of the perceptron neural network be {intamean, stdinta, CNmean, stdCN, AN(ki), CN(ki), power(ki)} and the output be L(ki, 2) (the body-text flag); then determine the transfer functions of the hidden layer and the output layer of the perceptron neural network, and calculate the number of design parameters to be optimized for the perceptron, DIM = HN × 8 + 1 (the number of input variables plus one bias node, multiplied by the number of hidden-layer nodes, plus 1; the 8 depends on the number of input variables: with n input variables it is n + 1).
Step S304-5: randomly generate the initial population P_t, where the subscript i = 1, 2, ..., Popsize identifies the i-th individual of population P_t, which stores the DIM design parameters of the perceptron to be optimized (the defining expressions are not reproduced in the source text). The random initialization formula is: (formula not reproduced in the source text)
where the subscript j = 1, ..., DIM, and rand(0, 1) is a function that generates a random real number uniformly distributed on [0, 1];
Step S304-6: calculate the fitness value of each individual in population P_t, where the subscript i = 1, 2, ..., Popsize;
Step S304-61: decode the individual into the connection weights and bias values of the perceptron neural network;
Step S304-62: set the counter Tm = 1;
Step S304-63: take the Tm-th group of the neural network training data set as input, where the input variables are {intamean, stdinta, CNmean, stdCN, AN, CN, power} and the output is L (the body-text flag);
Step S304-64: calculate the output value outL for each group of data; the true value is L;
Step S304-65: calculate the squared error SN_Tm between the output value outL and the true value L of each group of data, as follows:
SN_Tm = (outL - L)^2
Step S304-66: set the counter Tm = Tm + 1;
Step S304-67: if Tm > TraNum, execute step S304-68; otherwise execute step S304-63;
Step S304-68: calculate the sum of squared errors of the multilayer perceptron neural network on the sample data set (expression not reproduced in the source text), then set the fitness value of the individual (formula not reproduced in the source text). It should be noted that the above is only a preferred implementation of this embodiment.
Step S304-7: set the current evaluation count FEs = FEs + Popsize, and save the best individual Best_t in population P_t (the individual with the largest Fit value is the best individual);
Step S304-8: set the counter ki = 1;
Step S304-9: generate the selection factor Ches = rand(0, 1); if Ches > OBL, execute step S304-10; otherwise execute step S304-17;
Step S304-10: if the counter ki is greater than the population size Popsize, execute step S304-16; otherwise execute step S304-11;
Step S304-11: based on the current population, the crossover rate Cr, and the scaling factor F, generate the trial individual of the current individual, with the following steps:
Step S304-111: set the counter mj = 1;
Step S304-112: randomly generate a positive integer jRand in [1, DIM];
Step S304-113: randomly generate two unequal positive integers RI3 and RI4 in [1, Popsize];
Step S304-114: if the counter mj is greater than DIM, go to step S304-12; otherwise execute step S304-115;
Step S304-115: generate a random number R5 in [0, 1]; if R5 is less than the current crossover rate Cr of the individual, or jRand is equal to the counter mj, execute step S304-116; otherwise execute step S304-117;
Step S304-116: (formula not reproduced in the source text); then execute step S304-118;
Step S304-117: (formula not reproduced in the source text);
Step S304-118: set the counter mj = mj + 1, and go to step S304-114;
Step S304-12: calculate the fitness value of the trial individual;
Step S304-13: select between the individual and its trial individual the one that enters the next-generation population, as follows: (formula not reproduced in the source text)
Step S304-14: set the counter ki = ki + 1;
Step S304-15: go to step S304-9;
Step S304-16: calculate the fitness value of each individual in population P_t; then set the current evaluation count FEs = FEs + Popsize, save the best individual Best_t in population P_t, and execute step S304-20;
Step S304-17: calculate the minimum value, the maximum value, and the mean value of the current population in dimension j, with the following formula: (formula not reproduced in the source text)
Step S304-18: obtain the opposition-based learning population BP_t from the current population, generated as follows: (formula not reproduced in the source text)
Step S304-19: calculate the fitness value of each individual in BP_t, where i = 1, 2, ..., Popsize; according to the fitness values, select the Popsize best individuals from BP_t and P_t to update P_t, and save the best individual Best_t in population P_t;
Step S304-20: set the current generation t = t + 1;
Step S304-21: repeat steps S304-8 to S304-21 until the current evaluation count FEs reaches MAX_FEs, then stop. The best individual Best_t obtained during execution is decoded into the connection weights and bias values of the perceptron neural network, and training ends.
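Steps S304-4 and S304-61 to S304-64 can be illustrated by decoding a flat individual of length DIM = HN × 8 + 1 and running one forward pass. The text fixes only the length. The layout used below (HN × 7 hidden weights, then HN hidden biases, then one output bias, with unit output weights), the tanh hidden transfer function, and the sigmoid output are assumptions made for this sketch. The function names are hypothetical.

```python
import math

N_INPUTS = 7

def design_dim(hn, n_inputs=N_INPUTS):
    # DIM = HN x (n + 1) + 1, i.e. HN x 8 + 1 for the seven inputs named in the text.
    return hn * (n_inputs + 1) + 1

def decode(individual, hn, n_inputs=N_INPUTS):
    """Split a flat individual into (hidden_weights, hidden_biases, output_bias)."""
    assert len(individual) == design_dim(hn, n_inputs)
    split = hn * n_inputs
    weights = [individual[k * n_inputs:(k + 1) * n_inputs] for k in range(hn)]
    biases = individual[split:split + hn]
    return weights, biases, individual[-1]

def forward(individual, features, hn):
    """Return outL for one feature vector under the assumed layout."""
    weights, biases, out_bias = decode(individual, hn, len(features))
    hidden = [math.tanh(sum(w * x for w, x in zip(row, features)) + b)
              for row, b in zip(weights, biases)]
    # Assumption: hidden activations are summed with unit output weights.
    z = sum(hidden) + out_bias
    return 1.0 / (1.0 + math.exp(-z))      # outL in (0, 1)

out_l = forward([0.0] * design_dim(3), [1.0] * N_INPUTS, hn=3)
```

With an all-zero individual every hidden activation is zero, so the sigmoid output sits at its midpoint.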
In step S306, the text type of a new text is determined according to the trained neural network model as follows:
For each tag, the information vector = {intamean, stdinta, CNmean, stdCN, AN(ki), CN(ki), power(ki)} is fed to the network as input to obtain L, from which it can be judged whether the tag is main body text, and the webpage body text can thus be extracted.
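The final selection can be sketched as follows, assuming `predict` is a callable returning the network output for one feature row and that an output of at least 0.5 marks a tag as main body text. The threshold is an assumption; the source does not state one. The function name is hypothetical.

```python
def extract_body(tag_texts, feature_rows, predict, threshold=0.5):
    """Join the tags whose predicted status L marks them as main body text."""
    body = [text for text, row in zip(tag_texts, feature_rows)
            if predict(row) >= threshold]
    return "\n".join(body)

# Toy predictor: treat the last feature as the network output.
body = extract_body(["正文", "广告"], [[0.9], [0.1]], lambda row: row[-1])
```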
From the above description of the embodiments, those skilled in the art can clearly understand that the method of the above embodiments can be implemented by software together with a necessary general-purpose hardware platform, or by hardware, although in many cases the former is the better implementation. Based on this understanding, the technical solution of the present invention, in essence or in the part that contributes over the prior art, can be embodied in the form of a software product. The computer software product is stored in a storage medium (such as ROM/RAM, a magnetic disk, or an optical disc) and includes a number of instructions that cause a terminal device (which may be a mobile phone, a computer, a server, a network device, or the like) to execute the method described in each embodiment of the present invention.
Embodiment 2
This embodiment also provides a web page processing apparatus, which implements the above embodiments and preferred implementations; what has already been described is not repeated. As used below, the term "module" may be a combination of software and/or hardware that implements a predetermined function. Although the apparatus described in the following embodiment is preferably implemented in software, implementation in hardware, or in a combination of software and hardware, is also possible and contemplated.
Fig. 3 is a structural block diagram of a web page processing apparatus according to an embodiment of the present invention. As shown in Fig. 3, the apparatus includes: a first acquisition module 402, configured to obtain text attribute values of web pages in a training sample that contain a first language, where the text attribute values include a first parameter value corresponding to the first language in the web page and a second parameter value indicating whether the web page is text based on the first language; a first determining module 404, coupled to the first acquisition module 402 and configured to use the first parameter value as the input variable of a perceptron neural network to determine a third parameter value indicating whether the web page is text based on the first language; a second determining module 406, coupled to the first determining module 404 and configured to determine the adaptive value of population individuals in the perceptron neural network according to the second parameter value and the third parameter value; a decoding module 408, coupled to the second determining module 406 and configured to decode the individual with the best adaptive value in the population to obtain the connection weights and bias values of the perceptron neural network; and a third determining module 410, coupled to the decoding module 408 and configured to determine, based on the connection weights and bias values, whether a web page to be processed is text based on the first language.
It should be noted that each of the above modules can be implemented in software or hardware. For the latter, this can be done in the following ways, without limitation: the above modules are all located in the same processor; or the above modules are located in different processors in any combination.
An embodiment of the present invention also provides a storage medium storing a computer program, where the computer program is configured to execute, when run, the steps of any of the above method embodiments.
Optionally, in this embodiment, the storage medium may be configured to store a computer program for executing the following steps:
Step S1: use the first parameter value as the input variable of the perceptron neural network to determine a third parameter value indicating whether the web page is text based on the first language;
Step S2: determine the adaptive value of population individuals in the perceptron neural network according to the second parameter value and the third parameter value;
Step S3: decode the individual with the best adaptive value in the population to obtain the connection weights and bias values of the perceptron neural network;
Step S4: determine, based on the connection weights and bias values, whether the web page to be processed is text based on the first language.
Optionally, in this embodiment, the storage medium may include, but is not limited to, a USB flash drive, read-only memory (ROM), random access memory (RAM), a removable hard disk, a magnetic disk, an optical disc, or any other medium that can store a computer program.
An embodiment of the present invention also provides an electronic device including a memory and a processor, where the memory stores a computer program and the processor is configured to run the computer program to execute the steps of any of the above method embodiments.
Optionally, the electronic device may further include a transmission device and an input/output device, where the transmission device is connected to the processor and the input/output device is connected to the processor.
Optionally, in this embodiment, the processor may be configured to execute the following steps by means of the computer program:
Step S1: use the first parameter value as the input variable of the perceptron neural network to determine a third parameter value indicating whether the web page is text based on the first language;
Step S2: determine the adaptive value of population individuals in the perceptron neural network according to the second parameter value and the third parameter value;
Step S3: decode the individual with the best adaptive value in the population to obtain the connection weights and bias values of the perceptron neural network;
Step S4: determine, based on the connection weights and bias values, whether the web page to be processed is text based on the first language.
Optionally, for specific examples in this embodiment, reference may be made to the examples described in the above embodiments and optional implementations, which are not repeated here.
Obviously, those skilled in the art should understand that the modules or steps of the present invention described above can be implemented with a general-purpose computing device. They can be concentrated on a single computing device or distributed over a network formed by multiple computing devices. Optionally, they can be implemented with program code executable by a computing device, so that they can be stored in a storage device and executed by the computing device; in some cases, the steps shown or described can be executed in an order different from the one given here, or the modules or steps can each be made into an individual integrated circuit module, or several of them can be made into a single integrated circuit module. Thus, the present invention is not limited to any specific combination of hardware and software.
The above is only a preferred embodiment of the present invention and is not intended to limit the present invention. For those skilled in the art, the present invention may be modified and varied in various ways. Any modification, equivalent replacement, improvement, and the like made within the principle of the present invention shall fall within the protection scope of the present invention.

Claims (9)

1. A method for processing a web page, characterized by comprising:
obtaining text attribute values of web pages in a training sample that contain a first language, wherein the text attribute values include: a first parameter value corresponding to the first language in the web page, and a second parameter value indicating whether the web page is text based on the first language;
using the first parameter value as the input variable of a perceptron neural network to determine a third parameter value indicating whether the web page is text based on the first language;
determining the adaptive value of population individuals in the perceptron neural network according to the second parameter value and the third parameter value;
decoding the individual with the best adaptive value in the population to obtain the connection weights and bias values of the perceptron neural network;
determining, based on the connection weights and bias values, whether a web page to be processed is text based on the first language;
wherein the population used to determine the connection weights and bias values of the perceptron neural network is generated by the following steps:
Step 10: randomly generate a first population P_t with Popsize individuals, wherein each individual in the first population stores DIM design parameters to be optimized;
wherein the subscript i = 1, 2, ..., Popsize indexes the i-th individual in population P_t;
Step 11: set the value of the first counter ki to N;
Step 12: randomly generate a selection factor ches; if ches is greater than the preset opposition-based learning factor OBL, go to step 13; otherwise go to step 17;
Step 13: if the value N of the first counter ki is greater than the number of individuals Popsize in the first population, the first population is the population used to determine the connection weights and bias values of the perceptron neural network, that is, determination of the adaptive values of the individuals in the first population is triggered and the current evaluation count is set to the sum of the previous evaluation count and Popsize; otherwise go to step 14;
Step 14: based on the first population, a preset crossover rate Cr, and a preset scaling factor F, obtain the trial individual corresponding to an individual in the first population;
Step 15: calculate the adaptive value of the trial individual, and according to a preset rule select, between the individual in the first population and the trial individual, the one that enters the second population;
Step 16: set the value of the first counter ki to N plus 1, and go to step 12;
Step 17: determine the minimum value, maximum value, and mean value of the first population in a specified dimension j;
Step 18: based on the minimum value, the maximum value, and the mean value, obtain a third population BP_t;
Step 19: determine the adaptive values of the individuals in the third population; according to the adaptive values of the individuals in the first population and of the individuals in the third population, select the Popsize best individuals from the first population and the third population to replace all individuals in the first population;
Step 20: repeat steps 10 to 20 until the evaluation count reaches a preset value; once the evaluation count reaches the preset value, the first population as replaced in step 19 is the population used to determine the connection weights and bias values of the perceptron neural network.
2. The method according to claim 1, characterized in that obtaining the text attribute values of web pages in the training sample that contain the first language comprises:
obtaining the proportion of the first language in a web page containing the first language, the character count of the first language, and the total character count of the web page containing the first language;
determining, from the proportion and the character count, the mean of the proportion, the variance of the proportion, the mean of the character count, and the variance of the character count;
using the proportion of the first language, the character count of the first language, the total character count of the web page containing the first language, the mean of the proportion, the variance of the proportion, the mean of the character count, and the variance of the character count as the first parameter value;
determining the second parameter value based on the first parameter value.
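By way of illustration only, and not as part of the claims, the seven attribute values listed above could be computed as in the following Python sketch. It assumes the counts are taken per text block (tag) of one page, that the statistics are population variances, and that the output order follows the information vector {intamean, stdinta, CNmean, stdCN, AN(ki), CN(ki), power(ki)} given in the description; the mapping of those names to the quantities below and the function name `tag_features` are assumptions.

```python
from statistics import mean, pvariance


def tag_features(first_lang_counts, total_counts):
    """Build one 7-element attribute vector per text block.

    first_lang_counts[k]: character count of the first language in block k
    total_counts[k]:      total character count of block k
    """
    # Proportion of first-language characters per block (0 for empty blocks).
    ratios = [cn / an if an else 0.0
              for cn, an in zip(first_lang_counts, total_counts)]
    ratio_mean, ratio_var = mean(ratios), pvariance(ratios)
    cn_mean, cn_var = mean(first_lang_counts), pvariance(first_lang_counts)
    # ASSUMED order: mean/variance of proportion, mean/variance of count,
    # total count, first-language count, proportion.
    return [[ratio_mean, ratio_var, cn_mean, cn_var, an, cn, r]
            for cn, an, r in zip(first_lang_counts, total_counts, ratios)]


if __name__ == "__main__":
    for row in tag_features([8, 2], [10, 10]):
        print(row)
```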
3. The method according to claim 1, characterized in that the population individual in the specified dimension j is obtained by the following random initialization formula:
wherein the subscript j = 1, ..., DIM, and rand(0, 1) is a function that generates a random real number uniformly distributed on [0, 1].
4. The method according to claim 1, characterized in that obtaining, based on the first population, the preset crossover rate Cr, and the preset scaling factor F, the trial individual corresponding to an individual in the first population comprises:
Step 20: set the value of the second counter mj to M;
Step 21: randomly obtain a positive integer jRand between 1 and DIM, and randomly generate two unequal positive integers RI3 and RI4 between 1 and Popsize;
Step 23: if the value of the second counter mj is greater than DIM, go to step 15; otherwise go to step 24;
Step 24: randomly generate a random number R5 between 0 and 1; if R5 is less than the preset crossover rate Cr or jRand equals the value of the second counter mj, go to step 25; otherwise go to step 26;
Step 25: [formula omitted in source], then go to step 27;
Step 26: [formula omitted in source];
Step 27: add 1 to the value of the second counter mj, and go to step 23.
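By way of illustration only, and not as part of the claims, the loop over dimensions in steps 20 to 27 resembles the binomial crossover of differential evolution and can be sketched as follows. The formulas of steps 25 and 26 are not reproduced in this text, so the sketch assumes a DE/best/1 mutation for step 25 and a plain copy of the parent for step 26; the function name `make_trial` and its parameters are hypothetical.

```python
import random


def make_trial(pop, i, best, cr, f, rng=random):
    """Build the trial individual for pop[i] (0-based indices)."""
    dim = len(pop[i])
    j_rand = rng.randrange(dim)               # step 21: forced dimension
    r3, r4 = rng.sample(range(len(pop)), 2)   # step 21: two unequal indices
    trial = []
    for j in range(dim):                      # steps 23 and 27: loop over mj
        if rng.random() < cr or j == j_rand:  # step 24
            # ASSUMPTION (step 25): DE/best/1 mutation.
            trial.append(best[j] + f * (pop[r3][j] - pop[r4][j]))
        else:
            # ASSUMPTION (step 26): keep the parent's value.
            trial.append(pop[i][j])
    return trial


if __name__ == "__main__":
    population = [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0], [3.0, 3.0, 3.0]]
    print(make_trial(population, 0, [10.0, 10.0, 10.0], cr=0.9, f=0.5))
```

The forced dimension jRand guarantees that the trial individual differs from its parent in at least one dimension even when Cr is very small.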
5. The method according to claim 1 or 4, characterized in that the preset rule is as follows:
6. The method according to claim 1 or 2, characterized in that
using the first parameter value as the input variable of the perceptron neural network to determine the third parameter value indicating whether the web page is text based on the first language comprises:
Step 31: set a third counter Tm = 1;
Step 32: use the first parameter value as the input variable of the perceptron neural network to obtain the third parameter value outL indicating whether the web page is text based on the first language;
determining the adaptive value of population individuals in the perceptron neural network according to the second parameter value and the third parameter value comprises:
Step 33: calculate the squared error SN_Tm between the second parameter value L and the third parameter value outL as follows:
SN_Tm = (outL - L)^2
Step 34: increment the third counter, Tm = Tm + 1;
Step 35: if the value of the third counter is greater than the number of data items in the training sample used for training, go to step 36; otherwise go to step 32;
Step 36: determine the sum of squared errors of the perceptron neural network on the training sample data set, and determine the adaptive value of the individual as follows:
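By way of illustration only, and not as part of the claims, the accumulation in steps 31 to 36 can be sketched in Python as follows. The sketch stops at the sum of squared errors, because the formula that maps this sum to the adaptive value is not reproduced in this text; `predict` stands for any callable that returns outL for one attribute vector and is a hypothetical name.

```python
def sum_squared_error(predict, samples):
    """Sum of SN_Tm = (outL - L)^2 over all training samples.

    samples: iterable of (features, label) pairs, where label is the
             second parameter value L.
    """
    sse = 0.0
    for features, label in samples:      # Tm = 1, 2, ..., number of samples
        out_l = predict(features)        # step 32: third parameter value outL
        sse += (out_l - label) ** 2      # step 33: squared error for this sample
    return sse                           # step 36: sum over the data set


if __name__ == "__main__":
    toy = [([1.0], 0.0), ([0.5], 1.0)]
    print(sum_squared_error(lambda x: x[0], toy))
```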
7. An apparatus for processing a web page, characterized by comprising:
a first acquisition module, configured to obtain text attribute values of web pages in a training sample that contain a first language, wherein the text attribute values include: a first parameter value corresponding to the first language in the web page, and a second parameter value indicating whether the web page is text based on the first language;
a first determining module, configured to use the first parameter value as the input variable of a perceptron neural network to determine a third parameter value indicating whether the web page is text based on the first language;
a second determining module, configured to determine the adaptive value of population individuals in the perceptron neural network according to the second parameter value and the third parameter value;
a decoding module, configured to decode the individual with the best adaptive value in the population to obtain the connection weights and bias values of the perceptron neural network;
a third determining module, configured to determine, based on the connection weights and bias values, whether a web page to be processed is text based on the first language;
wherein the decoding module is further configured to execute the following method steps:
Step 10: randomly generate a first population P_t with Popsize individuals, wherein each individual in the first population stores DIM design parameters to be optimized;
wherein the subscript i = 1, 2, ..., Popsize indexes the i-th individual in population P_t;
Step 11: set the value of the first counter ki to N;
Step 12: randomly generate a selection factor ches; if ches is greater than the preset opposition-based learning factor OBL, go to step 13; otherwise go to step 17;
Step 13: if the value N of the first counter ki is greater than the number of individuals Popsize in the first population, the first population is the population used to determine the connection weights and bias values of the perceptron neural network, that is, determination of the adaptive values of the individuals in the first population is triggered and the current evaluation count is set to the sum of the previous evaluation count and Popsize; otherwise go to step 14;
Step 14: based on the first population, a preset crossover rate Cr, and a preset scaling factor F, obtain the trial individual corresponding to an individual in the first population;
Step 15: calculate the adaptive value of the trial individual, and according to a preset rule select, between the individual in the first population and the trial individual, the one that enters the second population;
Step 16: set the value of the first counter ki to N plus 1, and go to step 12;
Step 17: determine the minimum value, maximum value, and mean value of the first population in a specified dimension j;
Step 18: based on the minimum value, the maximum value, and the mean value, obtain a third population BP_t;
Step 19: determine the adaptive values of the individuals in the third population; according to the adaptive values of the individuals in the first population and of the individuals in the third population, select the Popsize best individuals from the first population and the third population to replace all individuals in the first population;
Step 20: repeat steps 10 to 20 until the evaluation count reaches a preset value; once the evaluation count reaches the preset value, the first population as replaced in step 19 is the population used to determine the connection weights and bias values of the perceptron neural network.
8. A storage medium, characterized in that a computer program is stored in the storage medium, wherein the computer program is configured to execute, when run, the method according to any one of claims 1 to 6.
9. An electronic device, including a memory and a processor, characterized in that a computer program is stored in the memory, and the processor is configured to run the computer program to execute the method according to any one of claims 1 to 6.
CN201810725034.5A 2018-07-04 2018-07-04 The processing method and processing device of webpage, storage medium, electronic device Active CN108984692B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810725034.5A CN108984692B (en) 2018-07-04 2018-07-04 The processing method and processing device of webpage, storage medium, electronic device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810725034.5A CN108984692B (en) 2018-07-04 2018-07-04 The processing method and processing device of webpage, storage medium, electronic device

Publications (2)

Publication Number Publication Date
CN108984692A CN108984692A (en) 2018-12-11
CN108984692B true CN108984692B (en) 2019-06-21

Family

ID=64536213

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810725034.5A Active CN108984692B (en) 2018-07-04 2018-07-04 The processing method and processing device of webpage, storage medium, electronic device

Country Status (1)

Country Link
CN (1) CN108984692B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108984694B (en) * 2018-07-04 2019-07-30 龙马智芯(珠海横琴)科技有限公司 The processing method and processing device of webpage, storage medium, electronic device

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103020498A (en) * 2012-11-19 2013-04-03 广东亚仿科技股份有限公司 Intelligent dynamic access control method and system

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103425765A (en) * 2013-08-06 2013-12-04 优视科技有限公司 Method and device for extracting webpage text and method and system for webpage preview
US10672025B2 (en) * 2016-03-08 2020-06-02 Oath Inc. System and method for traffic quality based pricing via deep neural language models
CN107358315A (en) * 2017-06-26 2017-11-17 深圳市金立通信设备有限公司 A kind of information forecasting method and terminal
CN108984694B (en) * 2018-07-04 2019-07-30 龙马智芯(珠海横琴)科技有限公司 The processing method and processing device of webpage, storage medium, electronic device

Also Published As

Publication number Publication date
CN108984692A (en) 2018-12-11

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CP02 Change in the address of a patent holder

Address after: 519031 office 1316, No. 1, lianao Road, Hengqin new area, Zhuhai, Guangdong

Patentee after: LONGMA ZHIXIN (ZHUHAI HENGQIN) TECHNOLOGY Co.,Ltd.

Address before: 519000 room 417, building 20, creative Valley, Hengqin new area, Xiangzhou, Zhuhai, Guangdong

Patentee before: LONGMA ZHIXIN (ZHUHAI HENGQIN) TECHNOLOGY Co.,Ltd.