Embodiment
To make the objects, technical solutions, and advantages of the present invention clearer, the invention is described in further detail below with reference to the accompanying drawings.
Research shows that fraud calls, suspected fraud calls, and non-fraud calls generally differ in clearly observable ways. For example, fraud calls are characterized by high-frequency calling during busy hours, relatively concentrated called users, and highly dispersed intervals between calls. Suspected fraud calls are characterized by high-frequency calling, relatively dispersed called users, high calling-circle overlap, and high call-time dispersion. Non-fraud calls are characterized by low-frequency calling concentrated in time, low calling-circle overlap, little calling activity overall, and essentially no calling during busy hours. The present invention can therefore use a clustering algorithm to classify calling numbers by the values of multiple characteristic indices computed for every calling number in the call detail records, assigning calling numbers with identical or similar characteristics to the same cluster. In other words, all users are divided into several clusters with clearly different characteristics; these clusters are then compared with the characteristics of confirmed fraud calls to find the fraud call cluster, whose characteristics are closest to those of confirmed fraud calls, and the suspected fraud call cluster, whose characteristics are the next closest. For the fraud call cluster and the suspected fraud call cluster, the invention further applies a logistic regression algorithm to precisely identify the fraud calls and suspected fraud calls within them, thereby achieving accurate identification and interception of fraud calls across the whole network.
As shown in Fig. 1, the method of the present invention for intercepting fraud calls in real time based on a clustering algorithm comprises:
Step 1: according to the collected call detail records, compute several characteristic index values of every calling number within a given time period, then use a clustering algorithm to divide all calling numbers into three clusters, so that the calling numbers in each cluster have identical or similar characteristic index values;
Step 2: match the characteristic index values of confirmed fraud numbers against the characteristic index values of the calling numbers in each of the three clusters. The closer the intervals formed by the characteristic index values, the higher the matching similarity. Finally, the cluster with the highest matching similarity is set as the fraud call cluster, and the cluster with the second-highest matching similarity is set as the suspected fraud call cluster;
Because fraud calls and suspected fraud calls have identical or similar characteristics, several characteristic indices that differ markedly between fraud and non-fraud calls can be selected. Through repeated trials and verification, it was found that the following characteristic indices can effectively distinguish fraud calls from non-fraud calls: call frequency, number of called numbers, standard deviation of the intervals between calls, number of calls to frequently called numbers, peak calling period, the maximum number of calls to the same called number, the second-largest number of calls to the same called number, and the third-largest number of calls to the same called number. Each of these characteristic index values is checked to see whether it falls in the same or a nearby interval as the corresponding value for confirmed fraud calls; the closer the values, the higher the matching similarity. In addition, the calling numbers in the three clusters can be compared with the confirmed fraud numbers to count how many confirmed fraud numbers each cluster contains. Finally, taking into account factors such as the matching similarity of the characteristic index values and the number of confirmed fraud numbers, one fraud call cluster and one suspected fraud call cluster are selected from the three clusters;
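One of the factors named above, the number of confirmed fraud numbers found in each cluster, can be sketched as follows. This is a minimal illustration only: the function name `label_clusters`, the cluster identifiers, and the sample numbers are hypothetical, and the sketch ranks clusters by that count alone, whereas the description also weighs the matching similarity of the characteristic index values.

```python
def label_clusters(clusters, confirmed_fraud):
    """Rank clusters by how many confirmed fraud numbers each one contains.

    clusters: dict mapping a cluster id to the list of calling numbers in it.
    confirmed_fraud: set of calling numbers already confirmed as fraud.
    """
    counts = {cid: len(set(members) & confirmed_fraud)
              for cid, members in clusters.items()}
    # Highest count first; the top two become the fraud and suspected clusters.
    ranked = sorted(counts, key=lambda cid: counts[cid], reverse=True)
    return {"fraud": ranked[0], "suspected": ranked[1], "counts": counts}


# Hypothetical data: three clusters and three confirmed fraud numbers.
clusters = {1: ["A", "B", "C"], 2: ["D", "E"], 3: ["F", "G", "H"]}
confirmed = {"A", "B", "D"}
result = label_clusters(clusters, confirmed)
```

In a fuller version, the count would be combined with an interval-overlap score per characteristic index before the two clusters are chosen.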
Step 3: using a logistic regression algorithm, compute the fraud suspicion index of each calling number in the fraud number cluster or the suspected fraud number cluster:

$$P(z_{ij}) = \frac{1}{1 + e^{-Y(z_{ij})}}, \qquad Y(z_{ij}) = \sum_{t=1}^{n} \alpha_{jt}\, x_{ijt} + \beta_j$$

where $z_{ij}$ is the i-th calling number in cluster j; j = 1 or 2, cluster 1 being the fraud number cluster and cluster 2 the suspected fraud number cluster; $P(z_{ij})$ is the fraud suspicion index of $z_{ij}$; $Y(z_{ij})$ is the fraud characteristic value of calling number $z_{ij}$; n is the number of characteristic indices; $\alpha_{jt}$ is the weight coefficient of characteristic index t in cluster j; $x_{ijt}$ is the value of characteristic index t of calling number $z_{ij}$; and $\beta_j$ is the maximum likelihood estimate for cluster j. Then determine whether the fraud suspicion index of the calling number is greater than the threshold of the fraud suspicion index. If so, the calling number is a fraud call or suspected fraud call; if not, the calling number is not a fraud number or suspected fraud number, and it is deleted from the fraud number cluster or suspected fraud number cluster to which it belongs;
The threshold of the fraud suspicion index is a real number in the interval [0, 1), and its value can be set according to actual conditions. The larger the fraud suspicion index, the more likely the calling number is a fraud call or suspected fraud call. For example, if the threshold is set to 0.9, a calling number whose fraud suspicion index is greater than or equal to 0.9 is determined to be a fraud call or suspected fraud call. The values of $\alpha_{jt}$ and $\beta_j$ can be obtained as follows: extract some confirmed fraud calls and non-fraud calls from the fraud number cluster or suspected fraud number cluster as samples, assign initial values to $\alpha_{jt}$ and $\beta_j$, compare the fraud suspicion index computed for each calling number in the sample with whether that number is actually a fraud call, and then adjust $\alpha_{jt}$ and $\beta_j$ repeatedly until the fraud suspicion indices computed from the sample meet the practical needs of the system. For example, after repeated adjustment, the weight coefficient of the characteristic index "call frequency" is set to -0.6626, that of "number of called numbers" to 0.004633, that of "standard deviation of the intervals between calls" to -0.001043, that of "number of calls to frequently called numbers" to 0.351, and the maximum likelihood estimate to -6.189;
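The example coefficients above can be plugged into a small sketch of the suspicion index. It assumes the standard logistic form, in which the maximum likelihood estimate acts as the constant term of the linear combination; the dictionary keys, the function names, and the feature values in the example are hypothetical, and the units of each characteristic index are not stated in the text.

```python
import math

# Example coefficients quoted in the description (four of the indices).
WEIGHTS = {
    "call_frequency": -0.6626,
    "callee_count": 0.004633,
    "interval_std": -0.001043,
    "frequent_callee_calls": 0.351,
}
INTERCEPT = -6.189  # the "maximum likelihood estimate", assumed to be the constant term


def suspicion_index(features, weights=WEIGHTS, intercept=INTERCEPT):
    """Logistic function of the weighted sum of characteristic index values."""
    y = intercept + sum(weights[name] * features[name] for name in weights)
    return 1.0 / (1.0 + math.exp(-y))


def is_suspicious(features, threshold=0.9):
    """Apply the example threshold of 0.9 (greater than or equal to)."""
    return suspicion_index(features) >= threshold


# Hypothetical feature values for one calling number.
example = {"call_frequency": 2, "callee_count": 500,
           "interval_std": 100, "frequent_callee_calls": 30}
baseline = {name: 0 for name in WEIGHTS}
```

With all index values at zero the index is governed by the constant term alone and stays far below the threshold, so a number is only flagged when its index values push the weighted sum well above zero.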
Step 4: update the forensics list and the interception list with all calling numbers in the fraud number cluster and the suspected fraud number cluster, respectively. That is, the calling numbers in the fraud number cluster are written to the forensics list, and the calling numbers in the suspected fraud number cluster are written to the interception list.
As shown in Fig. 2, Step 1 may further comprise:
Step 11: compute several characteristic index values of every calling number within a given time period, and build a corresponding characteristic index set for each calling number: $X_i = (x_{i1}, x_{i2}, \ldots, x_{iN})$, where $X_i$ is the characteristic index set of calling number $z_i$; $x_{i1}, x_{i2}, \ldots, x_{iN}$ are the characteristic index values of calling number $z_i$; and N is the number of characteristic indices;
For example, the following characteristic indices can be selected: call frequency, number of called numbers, standard deviation of the intervals between calls, number of calls to frequently called numbers, peak calling period, the maximum number of calls to the same called number, the second-largest number of calls to the same called number, and the third-largest number of calls to the same called number, so that N = 8;
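A minimal sketch of how these eight index values might be derived from one calling number's records is given below. The record layout (a list of call start times in seconds plus the called number for each call), the rule that a "frequently called number" is one called at least twice, and the use of the hour of day as the "peak calling period" are all assumptions made for illustration; the text does not define them.

```python
from collections import Counter
from statistics import pstdev


def feature_vector(call_times, callees):
    """Build the N = 8 characteristic index values for one calling number.

    call_times: sorted call start times in seconds within the period.
    callees: called number for each call, in the same order.
    Assumes at least one call in the period.
    """
    per_callee = Counter(callees)
    top = sorted(per_callee.values(), reverse=True)[:3]
    top += [0] * (3 - len(top))  # pad when fewer than three called numbers
    gaps = [later - earlier for earlier, later in zip(call_times, call_times[1:])]
    hours = Counter(int(t // 3600) % 24 for t in call_times)
    return {
        "call_frequency": len(call_times),
        "callee_count": len(per_callee),
        "interval_std": pstdev(gaps) if gaps else 0.0,
        # Assumed definition: calls made to numbers called two or more times.
        "frequent_callee_calls": sum(c for c in per_callee.values() if c >= 2),
        "peak_hour": hours.most_common(1)[0][0],
        "top1": top[0],
        "top2": top[1],
        "top3": top[2],
    }


# Hypothetical records: four calls, two of them to the same called number.
fv = feature_vector([0, 60, 120, 3600], ["x", "x", "y", "z"])
```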
Step 12: build three clusters (for example cluster 1, cluster 2, and cluster 3) and randomly assign all calling numbers to the three clusters, each calling number belonging to exactly one cluster;
Step 13: compute the characteristic index central value set $C_j$ of each cluster:

$$C_j = (c_{j1}, c_{j2}, \ldots, c_{jN}), \qquad c_{jt} = \frac{1}{M_j} \sum_{i=1}^{M_j} x_{ijt}$$

where $C_j$ is the characteristic index central value set of cluster j, j = 1, 2, or 3; $c_{jt}$ is the central value of characteristic index t in $C_j$; t is a natural number between 1 and N; i is a natural number between 1 and $M_j$; $M_j$ is the number of calling numbers in cluster j; and $x_{ijt}$ is the value of characteristic index t of calling number $z_{ij}$ in cluster j;
Step 14: compute the sum of squared errors over all calling numbers:

$$E = \sum_{j=1}^{3} \sum_{i=1}^{M_j} \sum_{t=1}^{N} \left(x_{ijt} - c_{jt}\right)^2$$

and determine whether E is less than or equal to the threshold of E. If so, the procedure ends. If not, compute the distance between each calling number and the characteristic index central value set of every cluster, select the minimum distance, and reassign the calling number to the cluster corresponding to that minimum distance, where the distance between calling number $z_i$ and the characteristic index central value set of cluster j is computed as:

$$d(z_i, C_j) = \sqrt{\sum_{t=1}^{N} \left(x_{it} - c_{jt}\right)^2}$$

where $x_{it}$ is the value of characteristic index t of calling number $z_i$; then return to Step 13. The threshold of E is a number between 0 and 1, and its value can be set according to actual conditions, for example $2.71828^{-5}$.
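Steps 12 to 14 can be sketched as the loop below. It is an illustration under stated assumptions rather than the claimed implementation: the distance is taken to be Euclidean, the initial random partition is arranged so that no cluster starts empty, and two safeguards not mentioned in the text are added (a maximum iteration count, and a stop when no calling number changes cluster), since the sum of squared errors may never fall below a small threshold on real data. The function name `kmeans` and the sample points are hypothetical.

```python
import math
import random


def kmeans(points, k=3, threshold=math.exp(-5), max_iter=100, seed=0):
    """Cluster feature vectors into k clusters following Steps 12 to 14."""
    rng = random.Random(seed)
    order = list(range(len(points)))
    rng.shuffle(order)
    labels = [0] * len(points)
    for rank, idx in enumerate(order):  # Step 12: random partition
        labels[idx] = rank % k
    centers = [[0.0] * len(points[0]) for _ in range(k)]
    sse = float("inf")
    for _ in range(max_iter):
        # Step 13: central value of each index = mean over cluster members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an emptied cluster keeps its previous center
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
        # Step 14: sum of squared errors against the assigned centers.
        sse = sum(math.dist(p, centers[lab]) ** 2 for p, lab in zip(points, labels))
        if sse <= threshold:
            break
        # Reassign every point to the cluster with the nearest center.
        new_labels = [min(range(k), key=lambda j: math.dist(p, centers[j]))
                      for p in points]
        if new_labels == labels:  # added safeguard: assignments are stable
            break
        labels = new_labels
    return labels, centers, sse


# Hypothetical two-dimensional feature vectors.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10),
          (20, 0), (20, 1), (21, 0)]
labels, centers, sse = kmeans(points)
```

Because the characteristic indices have very different scales, a practical version would probably normalize each index before computing distances; the text does not say whether this is done.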
For the fraud calls and suspected fraud calls in the forensics list and the interception list, the invention can also apply call recording for evidence collection and real-time interception, respectively, so as to control fraud calls effectively. As shown in Fig. 3, when a user initiates a call, the method further comprises:
Step A1: the originating MSC triggers the user-initiated call to the SCP, and the SCP determines whether the calling number of the call request is in the forensics list or the interception list. If so, the SCP returns a CONTINUE message to the originating MSC, carrying the forensics routing number or the interception routing number, and instructs the originating MSC to continue by triggering the call to the anti-fraud platform; the procedure then proceeds to the next step. If not, the original service flow is executed and this procedure ends;
When the calling number is in the forensics list, the CONTINUE message carries the forensics routing number; when the calling number is in the interception list, the CONTINUE message carries the interception routing number;
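The SCP decision in Step A1 amounts to a lookup in the two lists. The sketch below models only that decision; the routing number strings, the dictionary layout of the reply, and the function name `scp_decide` are hypothetical, and no real signaling interface is implied. Checking the forensics list first is also an assumption, made to mirror the order in which the anti-fraud platform tests the routing numbers.

```python
# Hypothetical routing numbers; real values would be operator-assigned.
FORENSICS_ROUTE = "ROUTE-FORENSICS"
INTERCEPT_ROUTE = "ROUTE-INTERCEPT"


def scp_decide(calling_number, forensics_list, interception_list):
    """Return the SCP reply for one call request."""
    if calling_number in forensics_list:
        # CONTINUE carrying the forensics routing number.
        return {"action": "CONTINUE", "routing_number": FORENSICS_ROUTE}
    if calling_number in interception_list:
        # CONTINUE carrying the interception routing number.
        return {"action": "CONTINUE", "routing_number": INTERCEPT_ROUTE}
    # Not listed: the original service flow is executed.
    return {"action": "NORMAL", "routing_number": None}


forensics = {"1001"}
interception = {"2002"}
```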
Step A2: on receiving the call request sent by the originating MSC, the anti-fraud platform determines whether the call request carries the forensics routing number. If so, it bridges the voice channel between the calling and called parties in the call request, performs one-way recording of the calling party's voice to generate a recording file, and saves the recording file in the natural-voice sample library or the playback-voice sample library, after which this procedure ends. If not, it proceeds to the next step;
Step A3: the anti-fraud platform determines whether the call request carries the interception routing number. If so, it bridges the voice channel between the calling and called parties in the call request, performs one-way recording of the calling party's voice, and generates a recording file after recording for S seconds. The recording file is then compared one by one with all fraud samples in the playback-voice sample library and the natural-voice sample library. When the recording file and a fraud sample are the same voice, the recording file is a fraud call, and the terminating MSC is instructed to disconnect the voice channel between the calling and called parties. When the recording file matches none of the fraud samples, the recording file is not a fraud call, and the original service flow continues.
Once the voice channel between the calling and called parties is bridged, all voice data between them passes through the anti-fraud platform. Because the called party's voice would interfere with the calling party's voice, the invention records only the calling party's voice. In Step A2, recording files can be screened manually by listening: if a recording file is a fraud call spoken by a live person, it is saved as a fraud sample in the natural-voice sample library; if it is a fraud call played back by a machine, it is saved as a fraud sample in the playback-voice sample library. As fraud samples accumulate, the natural-voice and playback-voice sample libraries become richer, and the accuracy of fraud call recognition increases. In Step A3, the value of S can be set according to actual needs, so that suspected fraud calls can be identified and intercepted in real time while the call is in progress.
In Step A3 of Fig. 3, comparing the recording file one by one with all fraud samples in the playback-voice sample library and the natural-voice sample library may further comprise: first comparing the recording file one by one with the fraud samples in the playback-voice sample library; then, when the recording file matches none of the fraud samples in the playback-voice sample library, comparing it one by one with the fraud samples in the natural-voice sample library.
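The ordering described here (playback-voice samples first, natural-voice samples only if nothing matched) can be sketched as below. The two comparison callables stand in for the playback-voice comparison and the speaker recognition comparison; their names, and the use of plain strings as recordings in the example, are placeholders for illustration only.

```python
def matches_fraud_sample(recording, playback_library, natural_library,
                         same_playback_voice, same_speaker):
    """Return True if the recording matches any fraud sample.

    same_playback_voice(recording, sample) and same_speaker(recording, sample)
    are caller-supplied comparison functions returning True or False.
    """
    # Stage 1: compare against every playback-voice sample.
    if any(same_playback_voice(recording, s) for s in playback_library):
        return True
    # Stage 2: reached only when no playback-voice sample matched.
    return any(same_speaker(recording, s) for s in natural_library)


# Placeholder comparators that log which stage was consulted.
calls = []


def playback_cmp(recording, sample):
    calls.append(("playback", sample))
    return recording == sample


def speaker_cmp(recording, sample):
    calls.append(("natural", sample))
    return recording == sample
```

Trying the playback-voice library first means the speaker recognition stage is skipped whenever a playback match is found.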
As shown in Fig. 4, comparing the recording file one by one with the fraud samples in the playback-voice sample library may further comprise:
Step A31: build a time feature value set for the recording file. Starting from the speech start point of the recording file and taking n seconds as one frame, sequentially extract G groups of W frames of voice information from the recording file. Using voice endpoint detection, compute the number of frames between the start point and end point of valid speech within each W-frame group, and record this frame count as the time feature value of that W-frame group. Then save the G computed time feature values in the time feature value set of the recording file, in their order of occurrence in the recording file;
The start and end points of speech can be detected with the dual-threshold decision method based on short-time energy and zero-crossing rate, so as to reject interference from silent segments of the call. The values of n, G, and W can be set according to actual needs, for example n = 10 ms, G = 100, W = 5. Repeated tests found that the invention works well when the minimum voice length is set to 10 s or more, that is, G ≥ 100 and W = 5;
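A simplified sketch of the time feature value set follows. It does not reproduce the full dual-threshold endpoint detector: each frame is classified on its own, as speech when its short-time energy exceeds a high threshold, or when it exceeds a low threshold and the zero-crossing rate exceeds its threshold. The threshold parameters, the function names, and the representation of audio as a list of sample values are assumptions for illustration.

```python
def short_time_energy(frame):
    """Sum of squared sample values in one frame."""
    return sum(s * s for s in frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / max(len(frame) - 1, 1)


def is_speech(frame, high_energy, low_energy, zcr_threshold):
    """Simplified per-frame decision using two energy thresholds and the ZCR."""
    energy = short_time_energy(frame)
    return energy > high_energy or (
        energy > low_energy and zero_crossing_rate(frame) > zcr_threshold)


def time_feature_set(samples, frame_len, G, W,
                     high_energy, low_energy, zcr_threshold):
    """Return G time feature values, one per group of W frames.

    Each value is the number of frames from the first to the last speech
    frame in the group (inclusive), or 0 if the group contains no speech.
    """
    frames = [samples[i:i + frame_len]
              for i in range(0, G * W * frame_len, frame_len)]
    active = [is_speech(f, high_energy, low_energy, zcr_threshold) for f in frames]
    features = []
    for g in range(G):
        hits = [i for i, a in enumerate(active[g * W:(g + 1) * W]) if a]
        features.append(hits[-1] - hits[0] + 1 if hits else 0)
    return features
```

Here `frame_len` is the number of samples in n seconds, which depends on the sampling rate of the recording; the text does not state that rate.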
Step A32: build an energy feature value set for the recording file. Starting from the speech start point of the recording file and taking n seconds as one frame, sequentially extract G*W frames of voice information from the recording file, and compute the short-time energy of each frame, which is recorded as the energy feature value of that frame. Then save the G*W energy feature values in the energy feature value set of the recording file, in their order of occurrence in the recording file;
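The energy feature value set can be sketched in a few lines. The sketch assumes the common unwindowed definition of short-time energy (the sum of squared sample values in a frame) and represents audio as a list of sample values; the function name `energy_feature_set` is hypothetical.

```python
def energy_feature_set(samples, frame_len, G, W):
    """Return the short-time energy of each of the first G*W frames.

    frame_len is the number of samples in one frame of n seconds.
    """
    return [sum(s * s for s in samples[i:i + frame_len])
            for i in range(0, G * W * frame_len, frame_len)]


# Hypothetical signal: four frames of two samples each.
energies = energy_feature_set([1, 1, 2, 2, 3, 3, 0, 0], frame_len=2, G=2, W=2)
```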
Step A33: read the time feature value set and energy feature value set of one fraud sample from the playback-voice sample library;
The time feature value set and energy feature value set of each fraud sample in the playback-voice sample library are built in the same way as those of the recording file, and the details are not repeated here;
Step A34: compare, one by one, the time feature values at the same positions in the time feature value sets of the recording file and the fraud sample, and thereby compute TS, the number of identical time feature values in the two sets;
Step A35: extract the first K energy feature values from the energy feature value sets of the recording file and the fraud sample, respectively. The value of K can be set according to actual needs, for example K = 5;
Step A36: compute the energy amplification factor B of the fraud sample relative to the recording file from the first K energy feature values of each, where $YE_b$ is the b-th energy feature value in the energy feature value set of the fraud sample and $GE_b$ is the b-th energy feature value in the energy feature value set of the recording file;
Step A37: adjust each energy feature value in the energy feature value set of the recording file according to the energy amplification factor B: $GE_b = B \times GE_b$, where b is a natural number between 1 and G*W;
Step A38: compare, one by one, the energy feature values at the same positions in the energy feature value sets of the recording file and the fraud sample, and thereby compute ES, the number of identical energy feature values in the two sets;
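Steps A35 to A38 can be illustrated as follows. Two points in this sketch are assumptions, because the text available here does not give them: the amplification factor B is taken to be the ratio of the summed first K energy values (fraud sample over recording file), and two energy values are treated as "identical" when they agree within a relative tolerance, since exact equality of floating-point energies is rarely meaningful. The function name and the sample values are hypothetical.

```python
import math


def energy_match_count(recording_energy, sample_energy, K=5, rel_tol=0.05):
    """Return (B, ES) for a recording file and one fraud sample.

    B: assumed amplification factor, sum of the sample's first K energies
       divided by the sum of the recording's first K energies.
    ES: number of positions whose scaled recording energy matches the sample.
    """
    denominator = sum(recording_energy[:K])
    if denominator == 0:
        return 0.0, 0  # silent recording: nothing to scale or match
    B = sum(sample_energy[:K]) / denominator
    scaled = [B * e for e in recording_energy]  # Step A37
    ES = sum(1 for g, y in zip(scaled, sample_energy)  # Step A38
             if math.isclose(g, y, rel_tol=rel_tol))
    return B, ES


# Hypothetical energy sets: the recording is the sample at half the level,
# except for the last frame.
sample = [10, 20, 30, 40, 50, 60]
recording = [5, 10, 15, 20, 25, 90]
B, ES = energy_match_count(recording, sample)
```

Scaling by B compensates for a recording that is uniformly louder or quieter than the stored sample, so that the position-by-position comparison reflects the shape of the energy contour rather than its level.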
Step A39: compute the fraud voice confidence of the recording file and the fraud sample, where F is the weight coefficient of the confidence, and determine whether this fraud voice confidence is greater than the fraud voice confidence threshold CC. If so, the recording file and the fraud sample are the same voice, that is, the incoming call corresponding to the recording file can be judged to be a fraud call, and this procedure ends. If not, the recording file and the fraud sample are not the same voice; continue by reading the time feature value set and energy feature value set of the next fraud sample from the playback-voice sample library, then return to Step A34. The values of F and of the fraud voice confidence threshold CC can be set according to actual conditions, for example F = 0.5 and CC = 90%.
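The decision in Step A39 can be sketched as below. The confidence formula used here is an assumption, since the formula itself is not available in this text: it weights the fraction of matching time feature values (TS out of G) by F and the fraction of matching energy feature values (ES out of G*W) by 1 - F. The function names are hypothetical; the defaults F = 0.5 and CC = 0.9 are the example values from the description.

```python
def count_identical(first, second):
    """Count positions at which two feature value sets hold the same value."""
    return sum(1 for a, b in zip(first, second) if a == b)


def fraud_voice_confidence(TS, ES, G, W, F=0.5):
    """Assumed form: weighted mix of the time and energy match fractions."""
    return F * TS / G + (1 - F) * ES / (G * W)


def is_same_voice(TS, ES, G, W, F=0.5, CC=0.9):
    """True when the confidence exceeds the threshold CC."""
    return fraud_voice_confidence(TS, ES, G, W, F) > CC


# Hypothetical counts with the example sizes G = 100 and W = 5.
confidence = fraud_voice_confidence(TS=95, ES=480, G=100, W=5)
```

Under this assumed form, both match fractions have to be high for the confidence to pass a 90% threshold when F = 0.5.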
The recording file can be compared with the fraud samples in the natural-voice sample library using text-independent speaker recognition (hereafter, speaker recognition). Speaker recognition is essentially a pattern matching problem. The basic principle is to perform feature extraction and model training on the speech of the target speaker to be identified, match the resulting model features against the model features in the natural-voice sample library, and then decide, from the matching similarity, which speaker in the natural-voice sample library the target most likely is. Commonly used feature extraction methods include linear prediction cepstral coefficients (LPCC), which are based on linear predictive coding (LPC), and Mel-frequency cepstral coefficients (MFCC), which are based on principles of speech and acoustics. Common pattern matching methods include template matching methods based on dynamic time warping (DTW), vector quantization (VQ), hidden Markov models (HMM), and Gaussian mixture models (GMM).
The quantization and recognition steps differ depending on the feature extraction and pattern matching methods used, and are not described in detail here. Some data indicate that GMM-based speaker recognition can reach an accuracy of 98% with a Gaussian mixture order of 32 and sufficient training data.
As shown in Fig. 5, the system of the present invention for intercepting fraud calls in real time based on a clustering algorithm comprises an anti-fraud platform, a service control point (SCP), and a mobile switching center (MSC), wherein:
The originating MSC is configured to, on receiving a user-initiated call, trigger the call to the SCP and then, as instructed by the SCP, continue by triggering the call to the anti-fraud platform;
The SCP is configured to, on receiving the user call request forwarded by the originating MSC, determine whether the calling number of the call request is in the forensics list or the interception list. If so, the SCP returns a CONTINUE message to the originating MSC, carrying the forensics routing number or the interception routing number, and instructs the originating MSC to continue by triggering the call to the anti-fraud platform. If not, the original service flow is executed. When the calling number is in the forensics list, the CONTINUE message carries the forensics routing number; when the calling number is in the interception list, the CONTINUE message carries the interception routing number;
The anti-fraud platform may further comprise:
A cluster analysis device, configured to compute, according to the collected call detail records, several characteristic index values of every calling number within a given time period; to divide all calling numbers into three clusters using a clustering algorithm, so that the calling numbers in each cluster have identical or similar characteristic index values; and to match the characteristic index values of confirmed fraud numbers against the characteristic index values of the calling numbers in each of the three clusters. The closer the intervals formed by the characteristic index values, the higher the matching similarity. Finally, the cluster with the highest matching similarity is set as the fraud call cluster, and the cluster with the second-highest matching similarity is set as the suspected fraud call cluster;
A logistic regression device, configured to compute, using a logistic regression algorithm, the fraud suspicion index of each calling number in the fraud number cluster or the suspected fraud number cluster:

$$P(z_{ij}) = \frac{1}{1 + e^{-Y(z_{ij})}}, \qquad Y(z_{ij}) = \sum_{t=1}^{n} \alpha_{jt}\, x_{ijt} + \beta_j$$

where $z_{ij}$ is the i-th calling number in cluster j; j = 1 or 2, cluster 1 being the fraud number cluster and cluster 2 the suspected fraud number cluster; $P(z_{ij})$ is the fraud suspicion index of $z_{ij}$; $Y(z_{ij})$ is the fraud characteristic value of calling number $z_{ij}$; n is the number of characteristic indices; $\alpha_{jt}$ is the weight coefficient of characteristic index t in cluster j; $x_{ijt}$ is the value of characteristic index t of calling number $z_{ij}$; and $\beta_j$ is the maximum likelihood estimate for cluster j. The device then determines whether the fraud suspicion index of the calling number is greater than the threshold of the fraud suspicion index. If so, the calling number is a fraud call or suspected fraud call; if not, the calling number is not a fraud number or suspected fraud number, and it is deleted from the fraud number cluster or suspected fraud number cluster to which it belongs;
A list update device, configured to update the forensics list and the interception list with all calling numbers in the fraud number cluster and the suspected fraud number cluster, respectively;
A call forwarding device, configured to, on receiving the call request sent by the originating MSC, determine whether the call request carries the forensics routing number or the interception routing number; if it carries the forensics routing number, the recording forensics device is notified, and if it carries the interception routing number, the fraud interception device is notified;
A recording forensics device, configured to bridge the voice channel between the calling and called parties in the call request, perform one-way recording of the calling party's voice to generate a recording file, and save the recording file in the natural-voice sample library or the playback-voice sample library;
A fraud interception device, configured to bridge the voice channel between the calling and called parties in the call request, perform one-way recording of the calling party's voice, generate a recording file after recording for S seconds, and compare the recording file one by one with all fraud samples in the playback-voice sample library and the natural-voice sample library. When the recording file and a fraud sample are the same voice, the recording file is a fraud call, and the terminating MSC is instructed to disconnect the voice channel between the calling and called parties.
As shown in Fig. 6, the cluster analysis device may further comprise:
A characteristic index construction unit, configured to compute several characteristic index values of every calling number within a given time period, and to build a corresponding characteristic index set for each calling number: $X_i = (x_{i1}, x_{i2}, \ldots, x_{iN})$, where $X_i$ is the characteristic index set of calling number $z_i$; $x_{i1}, x_{i2}, \ldots, x_{iN}$ are the characteristic index values of calling number $z_i$; and N is the number of characteristic indices;
A cluster initialization unit, configured to build three clusters (cluster 1, cluster 2, and cluster 3) and randomly assign all calling numbers to the three clusters, each calling number belonging to exactly one cluster;
A cluster center calculation unit, configured to compute the characteristic index central value set $C_j$ of each cluster:

$$C_j = (c_{j1}, c_{j2}, \ldots, c_{jN}), \qquad c_{jt} = \frac{1}{M_j} \sum_{i=1}^{M_j} x_{ijt}$$

where $C_j$ is the characteristic index central value set of cluster j, j = 1, 2, or 3; $c_{jt}$ is the central value of characteristic index t in $C_j$; t is a natural number between 1 and N; i is a natural number between 1 and $M_j$; $M_j$ is the number of calling numbers in cluster j; and $x_{ijt}$ is the value of characteristic index t of calling number $z_{ij}$ in cluster j. The unit then notifies the cluster adjustment unit to compute the sum of squared errors over all calling numbers;
A cluster adjustment unit, configured to compute the sum of squared errors over all calling numbers:

$$E = \sum_{j=1}^{3} \sum_{i=1}^{M_j} \sum_{t=1}^{N} \left(x_{ijt} - c_{jt}\right)^2$$

and to determine whether E is less than or equal to the threshold of E. If not, the unit computes the distance between each calling number and the characteristic index central value set of every cluster, selects the minimum distance, and reassigns the calling number to the cluster corresponding to that minimum distance, where the distance between calling number $z_i$ and the characteristic index central value set of cluster j is computed as:

$$d(z_i, C_j) = \sqrt{\sum_{t=1}^{N} \left(x_{it} - c_{jt}\right)^2}$$

where $x_{it}$ is the value of characteristic index t of calling number $z_i$. Finally, the unit notifies the cluster center calculation unit to recompute the characteristic index central value set of each cluster. The threshold of E is a number between 0 and 1, and its value can be set according to actual conditions, for example $2.71828^{-5}$.
As shown in Fig. 7, the fraud interception device may further comprise:
A voice recording unit, configured to receive the call request from the calling side, bridge the voice channel between the calling and called parties, and, once that voice channel is established, perform one-way recording of the calling party's voice and generate a recording file after recording for S seconds;
A playback-voice recognition unit, configured to compare the recording file one by one with the fraud samples in the playback-voice sample library, to identify whether the recording file and a fraud sample in the playback-voice sample library are the same voice;
A natural-voice recognition unit, configured to compare the recording file one by one with the fraud samples in the natural-voice sample library, to identify whether the recording file and a fraud sample in the natural-voice sample library are the same voice.
As shown in Fig. 8, the playback-voice recognition unit may further comprise:
A time feature construction component, configured to build a time feature value set for the recording file and for each fraud sample in the playback-voice sample library. Starting from the speech start point of the recording file or fraud sample and taking n seconds as one frame, the component sequentially extracts G groups of W frames of voice information from the recording file or fraud sample. Using voice endpoint detection, it computes the number of frames between the start point and end point of valid speech within each W-frame group, and records this frame count as the time feature value of that W-frame group. It then saves the G computed time feature values in the time feature value set of the recording file or fraud sample, in their order of occurrence in the recording file or fraud sample. The start and end points of speech can be detected with the dual-threshold decision method based on short-time energy and zero-crossing rate, so as to reject interference from silent segments of the call;
An energy feature construction component, configured to build an energy feature value set for the recording file and for each fraud sample in the playback-voice sample library. Starting from the speech start point of the recording file or fraud sample and taking n seconds as one frame, the component sequentially extracts G*W frames of voice information from the recording file or fraud sample, and computes the short-time energy of each frame, which is recorded as the energy feature value of that frame. It then saves the G*W energy feature values in the energy feature value set of the recording file or fraud sample, in their order of occurrence in the recording file or fraud sample;
A fraud confidence calculation component, configured to read the time feature value set and energy feature value set of each fraud sample from the playback-voice sample library, send the time feature value sets of the recording file and the fraud sample to the time feature recognition component, and at the same time send the energy feature value sets of the recording file and the fraud sample to the energy feature recognition component. It then computes the fraud voice confidence of the recording file and the fraud sample, where F is the weight coefficient of the confidence, and determines whether this fraud voice confidence is greater than the threshold CC. If so, the recording file and the fraud sample are the same voice; if not, they are not the same voice;
A time feature recognition component, configured to compare, one by one, the time feature values at the same positions in the time feature value sets of the recording file and the fraud sample, and thereby compute TS, the number of identical time feature values in the two sets;
An energy feature recognition component, configured to extract the first K energy feature values from the energy feature value sets of the recording file and the fraud sample, respectively, and then compute the energy amplification factor B of the fraud sample relative to the recording file from those values, where $YE_b$ is the b-th energy feature value in the energy feature value set of the fraud sample and $GE_b$ is the b-th energy feature value in the energy feature value set of the recording file. The component then adjusts each energy feature value in the energy feature value set of the recording file according to the energy amplification factor B: $GE_b = B \times GE_b$, where b is a natural number between 1 and G*W. Finally, it compares, one by one, the energy feature values at the same positions in the energy feature value sets of the recording file and the fraud sample, and thereby computes ES, the number of identical energy feature values in the two sets.
The natural-voice recognition unit can compare the recording file with the fraud samples in the natural-voice sample library using text-independent speaker recognition (hereafter, speaker recognition). Speaker recognition is essentially a pattern matching problem. The basic principle is to perform feature extraction and model training on the speech of the target speaker to be identified, match the resulting model features against the model features in the natural-voice sample library, and then decide, from the matching similarity, which speaker in the natural-voice sample library the target most likely is. Commonly used feature extraction methods include linear prediction cepstral coefficients (LPCC), which are based on linear predictive coding (LPC), and Mel-frequency cepstral coefficients (MFCC), which are based on principles of speech and acoustics. Common pattern matching methods include template matching methods based on dynamic time warping (DTW), vector quantization (VQ), hidden Markov models (HMM), and Gaussian mixture models (GMM).
The foregoing describes only preferred embodiments of the present invention and is not intended to limit the invention. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the invention.