Detailed Description of the Embodiments
To achieve the purpose of the present application, the embodiments of the present application provide a data processing method and device. When business data to be processed is obtained, the feature information contained in the business data is analyzed, and the feature information obtained from the analysis is used to determine the association relationship between the obtained business data and other business data. This avoids the relative discreteness of individual pieces of business data in the prior art, so that a server can analyze business data according to the association relationships between different pieces of business data, which improves the efficiency of business data analysis and saves system resources.
The embodiments of the present application are described in further detail below with reference to the accompanying drawings. The described embodiments are only some, rather than all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art on the basis of the embodiments in the present application without creative effort fall within the scope of protection of the present application.
Fig. 1 is a schematic flowchart of a data processing method provided by an embodiment of the present application. The method may be as follows.
Step 101: Obtain first business data to be processed and second business data to be processed.
In step 101, multiple pieces of business data are read from a server (first business data and second business data are taken as an example here). The first business data and the second business data read here may belong to the same service type or to different service types.
It should be noted that "first" and "second" in the first business data and the second business data have no special meaning and are used only to distinguish two different pieces of business data.
For example:
Business data a is obtained: Zhang San, male, date of birth 1970-1-1, native of Hangzhou, Zhejiang, residing at Room 402, Building 2, WenXin Building, West Lake District, Hangzhou;
Business data b is obtained: Li Si, male, date of birth 1973-2-1, native of Jinan, Shandong, residing at Room 402, Building 2, WenXin Building, West Lake District, Hangzhou;
Business data c is obtained: Wang Wu, 2014-1-1, from Hangzhou to Shanghai, took train D1234;
Business data d is obtained: Zhang San, 2014-1-1, from Hangzhou to Beijing, took train D1234.
From the obtained business data it can be seen that business data a and business data b belong to the same service type, and that business data c and business data d belong to the same service type. Business data a and b, on the one hand, and business data c and d, on the other, belong to different service types.
Step 102: Analyze the first feature information contained in the first business data and the second feature information contained in the second business data, respectively.
The feature information includes at least one of an object feature, a relationship feature, and an attribute feature.
In step 102, for the obtained first business data, the data content contained in the first business data is extracted first.
For example, if the first business data is business data a in the example above, the data content that can be extracted from business data a is "Zhang San", "male", "date of birth 1970-1-1", "native of Hangzhou, Zhejiang", and "residing at Room 402, Building 2, WenXin Building, West Lake District, Hangzhou".
If the first business data is business data d in the example above, the data content that can be extracted from business data d is "Zhang San", "2014-1-1", "from Hangzhou to Beijing", and "took train D1234".
Next, the feature type corresponding to each piece of data content is analyzed.
In the embodiments of the present application, the feature types include an object type, an attribute type, and a relationship type. A feature of the object type is called an object feature, a feature of the attribute type is called an attribute feature, and a feature of the relationship type is called a relationship feature.
The object type here refers to a type obtained by abstracting objective things, and generally includes an entity subtype and an event subtype.
For example, "Zhang San" in business data a is a person's name, and a person can be regarded as an entity, so the type corresponding to "Zhang San" is the entity subtype of the object type. "Room 402, Building 2, WenXin Building, West Lake District, Hangzhou" is a place name, and a place can also be regarded as an entity, so its corresponding type is likewise the entity subtype of the object type.
As another example, "Zhang San" in business data d is a person's name, and a person can be regarded as an entity, so the type corresponding to "Zhang San" is the entity subtype of the object type. "Took train D1234" represents an event, and an event can likewise be abstracted as an object, so the type corresponding to "took train D1234" is the event subtype of the object type.
The attribute type here refers to attributes of things. Because the object type includes the entity subtype, and an entity generally has a number of attributes, the type of those attributes is called the attribute type.
For example, business data a contains the data content "male", "date of birth 1970-1-1", and "native of Hangzhou, Zhejiang". "Male" identifies a person's sex, and sex can be regarded as an attribute, so the type corresponding to "male" is the attribute type. "Date of birth 1970-1-1" identifies a person's date of birth, which can also be regarded as an attribute, so its corresponding type is the attribute type. "Native of Hangzhou, Zhejiang" identifies a person's native place, which can also be regarded as an attribute, so its corresponding type is the attribute type.
The relationship type here refers to a structural relationship between different objects. This structural relationship may be an association relationship or a dependency relationship, which is not limited here.
For example, in business data a the relationship between "Zhang San" and "Room 402, Building 2, WenXin Building, West Lake District, Hangzhou" is "resides at", that is, a residence relationship, which can be expressed as "Zhang San"-"resides at"-"Room 402, Building 2, WenXin Building, West Lake District, Hangzhou". In business data d the relationship between "Zhang San" and "D1234" is "takes", that is, a ride relationship, which can be expressed as "Zhang San"-"takes"-"D1234".
Finally, the first feature information contained in the first business data is determined according to the feature types obtained from the analysis.
Specifically, the analysis yields the feature type corresponding to each piece of data content in the first business data, and the first feature information contained in the first business data can then be determined from these feature types.
For example, business data a contains the data content "Zhang San", "male", "date of birth 1970-1-1", "native of Hangzhou, Zhejiang", and "residing at Room 402, Building 2, WenXin Building, West Lake District, Hangzhou", whose feature types are, respectively, the entity subtype, the attribute type, the attribute type, the attribute type, and the relationship type. It is therefore determined that the first feature information contained in business data a includes an object feature, attribute features, and a relationship feature.
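The extraction described above for business data a can be sketched as follows. This is a minimal illustration rather than the implementation of the embodiment: the FeatureInfo class, the extract_registration_features function, and the record field names (name, sex, birth_date, native_place, residence) are all hypothetical, since the source gives the record only as free text.

```python
from dataclasses import dataclass, field

ADDRESS = "Room 402, Building 2, WenXin Building, West Lake District, Hangzhou"


@dataclass
class FeatureInfo:
    """Feature information of one piece of business data (hypothetical layout)."""
    objects: set = field(default_factory=set)        # object features (entities, events)
    attributes: set = field(default_factory=set)     # attribute features as (name, value)
    relationships: set = field(default_factory=set)  # (subject, relation, target) triples


def extract_registration_features(record):
    """Classify the data content of a registration record by feature type."""
    info = FeatureInfo()
    person, home = record["name"], record["residence"]
    # A person and a place are both treated as entities of the object type.
    info.objects.update({person, home})
    # Sex, date of birth and native place are attributes of the person.
    for key in ("sex", "birth_date", "native_place"):
        info.attributes.add((key, record[key]))
    # "Zhang San"-"resides at"-address is a relationship between two objects.
    info.relationships.add((person, "resides at", home))
    return info


# Business data a, with assumed field names.
record_a = {
    "name": "Zhang San",
    "sex": "male",
    "birth_date": "1970-1-1",
    "native_place": "Hangzhou, Zhejiang",
    "residence": ADDRESS,
}
features_a = extract_registration_features(record_a)
```

Here features_a holds all three kinds of feature, which matches the conclusion drawn for business data a in the text.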
Optionally, in the embodiments of the present application, the first feature information contained in the first business data may also be analyzed as follows:
Determining the service type corresponding to the first business data;
Determining, according to a mapping relationship between service types and data analysis models, the data analysis model corresponding to the service type of the first business data, where the data analysis model is used to extract the feature information of business data;
Analyzing the first feature information contained in the first business data by using the data analysis model.
Specifically, different business scenarios produce different business data, so the service type corresponding to the obtained first business data can be determined. After the service type corresponding to the first business data is determined, if a mapping relationship between service types and data analysis models has been established in advance, the data analysis model for analyzing business data of that service type can be determined. The first business data can then be analyzed with the obtained data analysis model to extract the first feature information contained in the first business data.
The following explains how the mapping relationship between service types and data analysis models is established.
An Internet platform provides a variety of business scenarios, and the business data collected differs from one scenario to another. For example, in an account registration scenario the collected business data contains basic user information such as name, sex, date of birth, native place, and place of residence; in a ticket purchase scenario the collected business data contains travel-related information such as the passenger's name, the passenger's ID card number, the place of departure and the destination, and information about the vehicle taken. It follows that the feature information contained in the business data collected in different business scenarios also differs. For one service type, the following operations can therefore be performed:
Obtaining at least two pieces of business data corresponding to the service type;
Determining the object feature, the relationship feature, and the attribute feature contained in each piece of the business data;
Building the data analysis model corresponding to the service type by using the object feature, the relationship feature, and the attribute feature contained in each piece of the business data.
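One possible reading of the three operations above is sketched below. The source does not say what form the model takes, so this sketch assumes the simplest one: a table from field name to feature type, merged from at least two annotated samples. The names build_model and apply_model, the field names, and the annotation format are hypothetical.

```python
ADDRESS = "Room 402, Building 2, WenXin Building, West Lake District, Hangzhou"


def build_model(samples):
    """Merge per-sample annotations (field -> feature type) into one model."""
    if len(samples) < 2:
        raise ValueError("at least two pieces of business data are required")
    schema = {}
    for sample in samples:
        for field_name, feature_type in sample.items():
            # The first annotation wins; a conflicting later one is an error.
            if schema.setdefault(field_name, feature_type) != feature_type:
                raise ValueError(f"conflicting feature types for {field_name!r}")
    return schema


def apply_model(schema, record):
    """Split a record's data content into the three kinds of feature."""
    features = {"object": set(), "relationship": set(), "attribute": set()}
    for field_name, value in record.items():
        feature_type = schema.get(field_name)
        if feature_type is not None:  # fields unknown to the model are skipped
            features[feature_type].add((field_name, value))
    return features


# Annotated samples for the account registration service type (assumed format).
samples = [
    {"name": "object", "sex": "attribute", "birth_date": "attribute",
     "residence": "relationship"},
    {"name": "object", "native_place": "attribute", "residence": "relationship"},
]
registration_model = build_model(samples)

# Business data b, with assumed field names.
record_b = {
    "name": "Li Si",
    "sex": "male",
    "birth_date": "1973-2-1",
    "native_place": "Jinan, Shandong",
    "residence": ADDRESS,
}
features_b = apply_model(registration_model, record_b)
```

The annotations follow the classification given earlier for business data a, where the residence content is assigned the relationship type.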
It should be noted that the data analysis model described in the embodiments of the present application may be presented in the form of a tool or in the form of a database, which is not limited here.
When the first business data is analyzed using the obtained data analysis model, the first business data may be input into the analysis tool, and the analysis tool extracts the feature information contained in the first business data.
For the second business data described in the embodiments of the present application, the second feature information contained in the second business data can be obtained in the same way as for the first business data, so the analysis method for the second feature information is not described in detail here.
It should be noted that "first" and "second" in "first feature information" and "second feature information" have no special meaning and are used only to denote the feature information corresponding to different pieces of business data.
Step 103: Determine the association relationship between the first business data and the second business data according to the first feature information and the second feature information.
In step 103, if the first feature information and the second feature information each contain at least one object feature, the association strength between the first business data and the second business data is determined according to the number of identical object features they contain;
If the first feature information and the second feature information each contain at least one relationship feature, the association strength between the first business data and the second business data is determined according to the number of identical relationship features they contain;
If the first feature information and the second feature information each contain at least one attribute feature, the association strength between the first business data and the second business data is determined according to the number of identical attribute features they contain, where the larger the number, the greater the corresponding association strength.
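The counting rule above can be sketched as set intersections over each kind of feature. Two points in this sketch are assumptions rather than statements of the source: relationship features are compared as (relation, target) pairs so that two different people who reside at the same address share a relationship feature, and the three counts are simply summed into a total. The feature dictionaries and field names are hypothetical.

```python
ADDRESS = "Room 402, Building 2, WenXin Building, West Lake District, Hangzhou"
FEATURE_TYPES = ("objects", "relationships", "attributes")


def association_strength(first, second):
    """Count identical features per feature type; a larger count means a stronger link."""
    shared = {kind: len(first[kind] & second[kind]) for kind in FEATURE_TYPES}
    shared["total"] = sum(shared[kind] for kind in FEATURE_TYPES)  # assumed aggregation
    return shared


# Feature information of business data a, b and c (assumed representation).
features_a = {
    "objects": {"Zhang San", ADDRESS},
    "relationships": {("resides at", ADDRESS)},
    "attributes": {("sex", "male"), ("birth_date", "1970-1-1"),
                   ("native_place", "Hangzhou, Zhejiang")},
}
features_b = {
    "objects": {"Li Si", ADDRESS},
    "relationships": {("resides at", ADDRESS)},
    "attributes": {("sex", "male"), ("birth_date", "1973-2-1"),
                   ("native_place", "Jinan, Shandong")},
}
features_c = {
    "objects": {"Wang Wu", "D1234"},
    "relationships": {("takes", "D1234")},
    "attributes": {("date", "2014-1-1")},
}

strength_ab = association_strength(features_a, features_b)
strength_ac = association_strength(features_a, features_c)
```

Under these assumptions, a and b share one object feature (the address), one relationship feature, and one attribute feature, while a and c share none, so a is more strongly associated with b than with c.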
For example, after the feature information of each of multiple obtained pieces of business data is determined as described above, if the address information, the time information, and the behavior data contained in those pieces of business data are found to be identical, it can be inferred that the users who produced them are strongly associated, that is, those users may have performed a service together. In practice, starting from the business data of one offender, other suspects associated with that offender can be found by analyzing the business data associated with it.
In another embodiment of the present application, the method further includes:
Establishing, according to the determined association relationship between the first business data and the second business data, a relationship network graph containing the first business data and the second business data.
For example:
Business data a is obtained: Zhang San, male, date of birth 1970-1-1, native of Hangzhou, Zhejiang, residing at Room 402, Building 2, WenXin Building, West Lake District, Hangzhou;
Business data b is obtained: Li Si, male, date of birth 1973-2-1, native of Jinan, Shandong, residing at Room 402, Building 2, WenXin Building, West Lake District, Hangzhou;
Business data c is obtained: Wang Wu, 2014-1-1, from Hangzhou to Shanghai, took train D1234;
Business data d is obtained: Zhang San, 2014-1-1, from Hangzhou to Beijing, took train D1234.
Once the feature information of each piece of business data has been determined using the analysis described above, the relationship network graph among business data a, business data b, business data c, and business data d can be determined. Fig. 2 shows the generated relationship network graph.
It can be seen directly from Fig. 2 that Zhang San and Wang Wu took the same train, D1234, and that Zhang San and Li Si both reside at Room 402, Building 2, WenXin Building, West Lake District, Hangzhou. Compared with the discrete business data stored in the network, the graph shows the association relationships between different pieces of business data more intuitively.
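A relationship network of this kind can be sketched with a plain adjacency structure built from relationship triples. The functions build_graph and co_related are hypothetical names, and the triples are a hand-written rendering of the relationship features of business data a to d; the actual layout of Fig. 2 is not reproduced here.

```python
from collections import defaultdict

ADDRESS = "Room 402, Building 2, WenXin Building, West Lake District, Hangzhou"

# Relationship features of business data a, b, c and d as (subject, relation, target).
TRIPLES = [
    ("Zhang San", "resides at", ADDRESS),  # from a
    ("Li Si", "resides at", ADDRESS),      # from b
    ("Wang Wu", "takes", "D1234"),         # from c
    ("Zhang San", "takes", "D1234"),       # from d
]


def build_graph(triples):
    """Build an undirected graph whose edges are labelled with the relation."""
    graph = defaultdict(set)
    for subject, relation, target in triples:
        graph[subject].add((relation, target))
        graph[target].add((relation, subject))
    return graph


def co_related(graph, subject):
    """Find other nodes linked to the same target by the same relation."""
    result = defaultdict(set)
    for relation, target in graph[subject]:
        for other_relation, other in graph[target]:
            if other != subject and other_relation == relation:
                result[other].add((relation, target))
    return dict(result)


graph = build_graph(TRIPLES)
links_of_zhang_san = co_related(graph, "Zhang San")
```

In this sketch, links_of_zhang_san connects Zhang San to Li Si through the shared address and to Wang Wu through train D1234, which are the two associations read off the graph in the text.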
In another embodiment of the present application, when the first feature information of the first business data is obtained through analysis, the method further includes:
Storing the first business data by category according to the type of the feature information, and establishing a mapping relationship between the type of the feature information and the first business data.
Specifically, when the feature information of each piece of business data is obtained, a mapping relationship between the business data and the feature information can be established according to the feature information contained in the business data. The system can then classify the collected business data according to the feature information, with pieces of business data that correspond to the same feature information belonging to the same category.
For example, if classification is performed by relationship feature, business data c and business data d above fall into the category of the same feature; if classification is performed by object feature, business data a and business data d fall into the category of the same feature.
In this way, once a feature type or a piece of business data is determined, other business data that matches the feature type or is associated with that business data can be obtained by searching, which improves the efficiency of data analysis and allows the associations between pieces of business data to be determined quickly.
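The classified storage and search described above can be sketched as a two-way index between features and business data. The FeatureIndex class and its method names are hypothetical, and relationship features are stored as (relation, target) pairs, which is an assumption chosen so that c and d fall into the same category as in the example.

```python
from collections import defaultdict

ADDRESS = "Room 402, Building 2, WenXin Building, West Lake District, Hangzhou"


class FeatureIndex:
    """Stores business data ids by feature and supports lookup in both directions."""

    def __init__(self):
        self._by_feature = defaultdict(set)  # (feature type, feature) -> record ids
        self._by_record = defaultdict(set)   # record id -> (feature type, feature)

    def store(self, record_id, feature_type, feature):
        key = (feature_type, feature)
        self._by_feature[key].add(record_id)
        self._by_record[record_id].add(key)

    def search_by_feature(self, feature_type, feature):
        """Return the ids of all business data filed under the given feature."""
        return set(self._by_feature.get((feature_type, feature), set()))

    def related_records(self, record_id):
        """Return the ids of other business data sharing any feature with record_id."""
        related = set()
        for key in self._by_record.get(record_id, set()):
            related |= self._by_feature[key]
        related.discard(record_id)
        return related


index = FeatureIndex()
index.store("a", "object", "Zhang San")
index.store("a", "relationship", ("resides at", ADDRESS))
index.store("b", "object", "Li Si")
index.store("b", "relationship", ("resides at", ADDRESS))
index.store("c", "object", "Wang Wu")
index.store("c", "relationship", ("takes", "D1234"))
index.store("d", "object", "Zhang San")
index.store("d", "relationship", ("takes", "D1234"))

same_train = index.search_by_feature("relationship", ("takes", "D1234"))
same_person = index.search_by_feature("object", "Zhang San")
related_to_d = index.related_records("d")
```

With this index, searching by the relationship feature returns c and d, and searching by the object feature "Zhang San" returns a and d, in line with the classification example in the text.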
With the technical solution provided by the embodiments of the present application, first business data to be processed and second business data to be processed are obtained; the first feature information contained in the first business data and the second feature information contained in the second business data are analyzed respectively, where the feature information includes at least one of an object feature, a relationship feature, and an attribute feature; and the association relationship between the first business data and the second business data is determined according to the first feature information and the second feature information. In this way, when business data to be processed is obtained, the feature information contained in the business data is analyzed, and the feature information obtained from the analysis is used to determine the association relationship between the obtained business data and other business data. This avoids the relative discreteness of individual pieces of business data in the prior art, so that a server can analyze business data according to the association relationships between different pieces of business data, which improves the efficiency of business data analysis and saves system resources.
Fig. 3 is a schematic structural diagram of a data processing device provided by an embodiment of the present application. The data processing device includes an acquiring unit 31, an analysis unit 32, and a processing unit 33, where:
The acquiring unit 31 is configured to obtain first business data to be processed and second business data to be processed;
The analysis unit 32 is configured to analyze the first feature information contained in the first business data and the second feature information contained in the second business data, respectively, where the feature information includes at least one of an object feature, a relationship feature, and an attribute feature;
The processing unit 33 is configured to determine the association relationship between the first business data and the second business data according to the first feature information and the second feature information.
In another embodiment of the present application, the analysis unit 32 analyzing the first feature information contained in the first business data includes:
Determining the service type corresponding to the first business data;
Determining, according to a mapping relationship between service types and data analysis models, the data analysis model corresponding to the service type of the first business data, where the data analysis model is used to extract the feature information of business data;
Analyzing the first feature information contained in the first business data by using the data analysis model.
In another embodiment of the present application, the data processing device further includes an establishing unit 34, where the establishing unit 34 establishes the mapping relationship between service types and data analysis models in the following manner:
For one service type, obtaining at least two pieces of business data corresponding to the service type;
Determining the object feature, the relationship feature, and the attribute feature contained in each piece of the business data;
Building the data analysis model corresponding to the service type by using the object feature, the relationship feature, and the attribute feature contained in each piece of the business data.
In another embodiment of the present application, the processing unit 33 determining the association relationship between the first business data and the second business data according to the first feature information and the second feature information includes:
If the first feature information and the second feature information each contain at least one object feature, determining the association strength between the first business data and the second business data according to the number of identical object features they contain;
If the first feature information and the second feature information each contain at least one relationship feature, determining the association strength between the first business data and the second business data according to the number of identical relationship features they contain;
If the first feature information and the second feature information each contain at least one attribute feature, determining the association strength between the first business data and the second business data according to the number of identical attribute features they contain, where the larger the number, the greater the corresponding association strength.
In another embodiment of the present application, the establishing unit 34 is further configured to establish, according to the determined association relationship between the first business data and the second business data, a relationship network graph containing the first business data and the second business data.
In another embodiment of the present application, the data processing device further includes a storage unit 35, where:
The storage unit 35 is configured to, when the first feature information of the first business data is obtained through analysis, store the first business data by category according to the type of the feature information, and establish a mapping relationship between the type of the feature information and the first business data.
It should be noted that the data processing device provided by the embodiments of the present application may be implemented in software or in hardware, which is not limited here. When the data processing device obtains business data to be processed, it analyzes the feature information contained in the business data and uses the feature information obtained from the analysis to determine the association relationship between the obtained business data and other business data. This avoids the relative discreteness of individual pieces of business data in the prior art, so that a server can analyze business data according to the association relationships between different pieces of business data, which improves the efficiency of business data analysis and saves system resources.
A person skilled in the art will understand that the embodiments of the present application may be provided as a method, an apparatus (device), or a computer program product. The present application may therefore take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware. Moreover, the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, and optical storage) that contain computer-usable program code.
The present application is described with reference to flowcharts and/or block diagrams of the method, apparatus (device), and computer program product according to the embodiments of the present application. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and any combination of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or another programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means that implement the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or another programmable data processing device, such that a series of operational steps is performed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
Although preferred embodiments of the present application have been described, a person skilled in the art can make further changes and modifications to these embodiments once the basic inventive concept is known. The appended claims are therefore intended to be construed as covering the preferred embodiments and all changes and modifications that fall within the scope of the present application.
Evidently, a person skilled in the art can make various changes and modifications to the present application without departing from its spirit and scope. If these changes and modifications fall within the scope of the claims of the present application and their technical equivalents, the present application is intended to cover them as well.