Detailed Description
After identifying the technical problems described in the background art, the inventors conducted long-term research and analysis of existing user data processing technology. In the course of this research and analysis, the inventors found that common user behavior processing technology performs batch, indiscriminate analysis of all user data of the user service terminal. However, among all the user data of the user service terminal, part of the data may be irrelevant to user behavior; that is, this part of the data has no analysis or mining value. When conventional user data processing technology is used to process such useless data, processing resources are occupied, the analysis rate of the user portrait is reduced, and considerable noise data may be introduced, which degrades the analysis precision of the user portrait.
In view of the above, the present invention provides a big data-based user data processing method and a big data server, which achieve targeted collection of user data before user portrait analysis so as to filter out useless data. Thus, when user portrait analysis is performed, user data with analysis and mining value can be analyzed directly, which improves both the analysis rate and the analysis accuracy of the user portrait, thereby improving the processing efficiency of user data.
For a better understanding of the technical solutions, the technical solutions of the present application are described in detail below with reference to the drawings and specific embodiments. It should be understood that the specific features in the embodiments and examples of the present application are detailed descriptions of the technical solutions of the present application rather than limitations of them, and that the technical features in the embodiments and examples of the present application may be combined with each other where no conflict arises.
In the following detailed description, numerous specific details are set forth by way of examples in order to provide a thorough understanding of the relevant guidance. It will be apparent, however, to one skilled in the art that the present application may be practiced without these specific details. In other instances, well-known methods, procedures, systems, compositions, and/or circuits have been described at a relatively high-level, without detail, in order to avoid unnecessarily obscuring aspects of the present application.
These and other features, functions, methods of execution, and combinations of functions and elements of related elements in the structure and economies of manufacture disclosed in the present application may become more apparent upon consideration of the following description with reference to the accompanying drawings, all of which form a part of this application. It is to be expressly understood, however, that the drawings are for the purpose of illustration and description only and are not intended as a definition of the limits of the application. It should be understood that the drawings are not to scale.
The present application uses flowcharts to illustrate the operations performed by a system according to embodiments of the present application. It should be expressly understood that the operations of a flowchart may be performed out of order: they may be performed in reverse order or simultaneously, at least one other operation may be added to the flowchart, and one or more operations may be deleted from the flowchart.
Fig. 1 is a block diagram illustrating an exemplary big data-based user data processing system 10 according to some embodiments of the present application. The big data-based user data processing system 10 may include a big data server 100 and a user service terminal 200.
In some embodiments, as shown in Fig. 2, the big data server 100 may include a processing engine 110, a network module 120, and a memory 130, with the processing engine 110 and the memory 130 communicating through the network module 120.
The processing engine 110 may process relevant information and/or data to perform one or more of the functions described herein. For example, in some embodiments, the processing engine 110 may include at least one processing engine (e.g., a single-core processing engine or a multi-core processor). By way of example only, the processing engine 110 may include a Central Processing Unit (CPU), an Application-Specific Integrated Circuit (ASIC), an Application-Specific Instruction Set Processor (ASIP), a Graphics Processing Unit (GPU), a Physics Processing Unit (PPU), a Digital Signal Processor (DSP), a Field-Programmable Gate Array (FPGA), a Programmable Logic Device (PLD), a controller, a Microcontroller Unit (MCU), a Reduced Instruction Set Computer (RISC), a microprocessor, or the like, or any combination thereof.
The network module 120 may facilitate the exchange of information and/or data. In some embodiments, the network module 120 may be any type of wired or wireless network, or a combination thereof. Merely by way of example, the network module 120 may include a cable network, a wired network, a fiber-optic network, a telecommunications network, an intranet, the Internet, a Local Area Network (LAN), a Wide Area Network (WAN), a Wireless Local Area Network (WLAN), a Metropolitan Area Network (MAN), a Public Switched Telephone Network (PSTN), a Bluetooth network, a wireless personal area network, a Near Field Communication (NFC) network, or the like, or any combination thereof. In some embodiments, the network module 120 may include at least one network access point. For example, the network module 120 may include wired or wireless network access points, such as base stations and/or network access points.
The memory 130 may be, but is not limited to, a Random Access Memory (RAM), a Read-Only Memory (ROM), a Programmable Read-Only Memory (PROM), an Erasable Programmable Read-Only Memory (EPROM), an Electrically Erasable Programmable Read-Only Memory (EEPROM), or the like. The memory 130 is used for storing a program, and the processing engine 110 executes the program after receiving an execution instruction.
It will be appreciated that the configuration shown in Fig. 2 is merely illustrative, and that the big data server 100 may also include more or fewer components than shown in Fig. 2, or have a different configuration from that shown in Fig. 2. The components shown in Fig. 2 may be implemented in hardware, software, or a combination thereof.
Fig. 3 is a flowchart illustrating an exemplary big data-based user data processing method and/or process according to some embodiments of the present application. The big data-based user data processing method is applied to the big data server 100 in Fig. 1 and may specifically include the following steps S31-S35.
Step S31, acquiring a service data analysis instruction for the user service terminal. For example, the user service terminal may be a smart device, including but not limited to a smartphone, various computer products, and a vehicle-mounted communication device. The service data analysis instruction may be initiated by a facilitator platform in communication with the big data server. Of course, the service data analysis instruction only targets non-privacy interaction services of the user service terminal, such as video viewing services, online shopping services, online forum services, or government and enterprise services authorized by the user service terminal.
Step S32, when it is determined, based on the service data analysis instruction, that the user service terminal is in a service data interaction state, determining a user data analysis policy based on the service data analysis instruction. For example, the service data interaction state may be used to characterize that the user service terminal is in a service interaction online state or a service interaction active state. The service data interaction state may differ between service scenes: in a video watching service scene, it may be a barrage (bullet-screen) input state of the user, while for an online shopping service, it may be a browsing state in which the user searches for commodities or a click state in which commodities are purchased. The user data analysis policy is used to provide guiding opinions on the analysis of user data. Further, the user data analysis policy may be formulated according to the service data analysis requirement carried in the service data analysis instruction, and the service data analysis requirement may include requirements for acquiring different types of user portraits, which is not described in detail herein.
Step S33, obtaining the service interaction object and the service interaction type data related to the service interaction period, and obtaining target service interaction object data based on the service interaction object related to the service interaction period. For example, the service interaction period may be the period corresponding to the service data interaction state, such as the period during which the user performs barrage input, or the period during which the user browses in a commodity search, which is not limited herein. The service interaction object may be another terminal with which the user service terminal has service interaction. The service interaction type data is used to represent different service interaction types, such as the video barrage interaction and shopping interaction mentioned above. The target service interaction object data is used to record the relevant characteristic information of the service interaction object.
Step S34, determining a user data collection policy based on the analysis policy indication information of the user data analysis policy, the service interaction type data, and the target service interaction object data. For example, the analysis policy indication information may include data analysis logic algorithms or logic programming statements for different user data; the underlying logic algorithms and logic programming statements are not further described herein. The user data collection policy is used to guide the big data server in targeted collection of user data, that is, to indicate which types of user data the big data server should collect and which types it should filter out or discard.
Step S35, collecting the user data to be processed from the user service terminal through the user data collection policy, and performing user portrait analysis on the user data to be processed based on the user data analysis policy to obtain a user portrait analysis result. For example, the user data to be processed is essentially data with analysis and mining value, so the user data can be analyzed in a targeted manner during user portrait analysis, which can increase the analysis rate of the user portrait, improve its analysis precision, and improve the processing efficiency of the user data.
To sum up, as described in steps S31-S35, before the user portrait analysis is performed, the user data analysis policy is determined based on the service data analysis instruction, and then the service interaction object and the service interaction type data related to the service interaction period are determined, so that the user data acquisition policy can be determined. Therefore, the big data server can collect the user data of the user service terminal in a targeted manner based on the user data collection strategy, so that useless data can be filtered out, and therefore when the user portrait is analyzed, the user data to be processed with data analysis and mining value can be directly analyzed, the analysis rate of the user portrait can be improved, the analysis precision of the user portrait can be improved, and the processing efficiency of the user data is improved. It can be understood that the method can combine the user data analysis strategy and the user data acquisition strategy, thereby improving the intelligent degree of user portrait analysis.
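The overall flow of steps S31-S35 can be sketched as a small pipeline. The following is a minimal, self-contained illustration only; all data structures and function names (`derive_analysis_policy`, `collect_and_analyze`, etc.) are assumptions for exposition, not the actual server implementation.

```python
# Hypothetical sketch of steps S31-S35: gate on the interaction state,
# derive an analysis policy, derive a collection policy, then collect
# only the valuable categories before portrait analysis.

def derive_analysis_policy(instruction):
    # Step S32: the policy here simply records which portrait types
    # the instruction's analysis requirement asks for.
    return {"required_portraits": instruction["requirements"]}

def build_collection_policy(policy, type_data, object_data):
    # Step S34: keep only the interaction types relevant to the
    # requested portraits; everything else is filtered out later.
    relevant = set(policy["required_portraits"]) & set(type_data)
    return {"collect_categories": sorted(relevant), "objects": object_data}

def collect_and_analyze(instruction, terminal_data):
    # Step S31 is done by the caller (the instruction is already acquired).
    if not instruction.get("interacting"):  # gate of step S32
        return None
    policy = derive_analysis_policy(instruction)
    collection = build_collection_policy(
        policy, instruction["type_data"], instruction["objects"])  # S33-S34
    # Step S35: collect only categories named by the collection policy,
    # so useless data never reaches portrait analysis.
    pending = {c: terminal_data[c]
               for c in collection["collect_categories"] if c in terminal_data}
    return {"portrait": sorted(pending)}  # stand-in for the analysis result
```

The key property the sketch preserves is that filtering happens at collection time (step S35's first half), before any portrait analysis runs.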
In the following, some alternative embodiments will be described, which should be understood as examples and not as technical features essential for implementing the present solution.
It is understood that, in a possible embodiment, in order to ensure real-time performance of the target service interaction object data, thereby improving timeliness of user portrait analysis and avoiding lag of user portrait analysis, in step S33, obtaining the target service interaction object data based on the service interaction object related to the service interaction period may include the following steps S331 and S332.
Step S331, performing service interaction behavior detection on the service interaction object related to the service interaction time period to obtain real-time service interaction object data corresponding to the service interaction behavior and interaction object change data of the real-time service interaction object data. For example, the service interaction behavior detection may be implemented by a preset detection thread, and the relevant configuration of the detection thread is the prior art and will not be further described herein.
Step S332, using the real-time service interaction object data corresponding to the service interaction behavior and the interaction object change data of the real-time service interaction object data as target service interaction object data.
Based on the above steps S331 and S332, the real-time service interaction object data and the interaction object change data of the real-time service interaction object data can be determined based on the preset detection thread, so that the real-time property of the target service interaction object data can be ensured, the timeliness of the user portrait analysis is improved, and the user portrait analysis is prevented from lagging.
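The detection of steps S331 and S332 can be illustrated by diffing successive snapshots of the interaction objects; the snapshot format and the function name below are illustrative assumptions, since the patent leaves the detection thread's configuration to the prior art.

```python
# Hypothetical sketch of steps S331-S332: real-time service interaction
# object data plus its change data together form the target object data.

def detect_target_object_data(previous_snapshot, current_snapshot):
    """Each snapshot maps an interaction-object id to its real-time data."""
    changes = {
        "joined": sorted(set(current_snapshot) - set(previous_snapshot)),
        "left": sorted(set(previous_snapshot) - set(current_snapshot)),
        "updated": sorted(k for k in current_snapshot
                          if k in previous_snapshot
                          and current_snapshot[k] != previous_snapshot[k]),
    }
    # Step S332: bundle the real-time data with its change data.
    return {"realtime": current_snapshot, "changes": changes}
```

A periodic detection thread would call this on each pair of consecutive snapshots, keeping the target object data current.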
In a further embodiment, the step S34 of determining the user data collection policy based on the analysis policy indication information of the user data analysis policy, the service interaction type data and the target service interaction object data may be implemented by the following step S340.
Step S340, sending the analysis policy indication information of the user data analysis policy, the service interaction type data, and the target service interaction object data to a preset collection policy generation model, and determining a user data collection policy based on the analysis policy indication information of the user data analysis policy, the service interaction type data, and the target service interaction object data in the preset collection policy generation model. For example, the preset acquisition strategy generation model may be a pre-established algorithm model, and the training samples and the testing samples of the model may be obtained according to the previous user portrait analysis record, which is not described herein again. By the design, the user data acquisition strategy can be determined based on the acquisition strategy generation model, so that the user data acquisition strategy is ensured to be matched with the actual user behavior.
It is understood that, on the basis of step S340, a user data collection policy is determined in a preset collection policy generation model based on the analysis policy indication information of the user data analysis policy, the service interaction type data, and the target service interaction object data, and further includes the following contents described in steps S341 to S344. The following different functional units of the acquisition policy generation model may be understood as different processing layers or different processing threads of the acquisition policy generation model, and related functions of these functional units may be adaptively adjusted through parameter adjustment, which is not further described herein.
Step S341, integrating the service interaction type data and the target service interaction object data into acquisition policy matching information by calling a data integration unit of the acquisition policy generation model; generating a page click analysis result corresponding to the analysis strategy indication information by calling an information processing unit of the acquisition strategy generation model, and generating a user behavior simulation result corresponding to the acquisition strategy matching information; the page click analysis result and the user behavior simulation result respectively comprise a plurality of user behavior events with different user interest heat values. For example, the user interest heat value is used for representing the correlation degree between different click events in the user behavior events, and the user behavior events comprise a plurality of different click events.
Step S342, extracting, through the analysis policy indication information, the original user access track information of any user behavior event in the page click analysis result, and determining the user behavior event with the minimum user interest heat value in the user behavior simulation result as a target user behavior event; mapping the original user access track information to the target user behavior event through the information processing unit to obtain original access track mapping information in the target user behavior event, and generating an information association label set between the analysis policy indication information and the acquisition policy matching information according to the original user access track information and the original access track mapping information.
Step S343, using the original access track mapping information as reference information to obtain service interaction description information in the target user behavior event, mapping the service interaction description information to the user behavior event where the original user access track information is located according to the tag grouping result corresponding to the information associated tag set, obtaining the to-be-processed strategy matching information corresponding to the service interaction description information in the user behavior event where the original user access track information is located, and determining the target user access track information corresponding to the to-be-processed strategy matching information.
Step S344, obtaining an information mapping record of mapping the original user access track information to the target user behavior event; according to the information association degree between the to-be-processed policy matching information and the historical policy matching information corresponding to a plurality of to-be-matched event records in the information mapping record, sequentially acquiring target click events corresponding to the target user access track information from the user behavior simulation result, until the influence weight of the user behavior event where the target click event is located is consistent with the influence weight of the target user access track information in the page click analysis result, then stopping acquiring target click events in the next user behavior event, and establishing a data processing association relationship between the target user access track information and the last acquired target click event; and calling a policy generation unit of the acquisition policy generation model to extract information features of the acquisition policy matching information according to the data processing association relationship, and generating the user data acquisition policy according to the information feature extraction result. For example, the data processing association relationship may be used to record the correspondence between the analysis policy indication information and the acquisition policy matching information, thereby implementing deep fusion of the user data analysis policy and the user data acquisition policy.
In this way, through the above steps S341 to S344, the correlation analysis of the analysis policy indication information, the service interaction type data, and the target service interaction object data can be implemented by calling different functional units of the acquisition policy generation model, so that the correlation between the user data analysis policy and the user data acquisition policy can be considered, thereby ensuring that the user data to be processed acquired through the user data acquisition policy can be highly matched with the user data analysis policy, and thus, the user portrait analysis result can be accurately obtained in real time.
On the basis of the above, in order to achieve targeted collection of user data so as to reduce or eliminate noise data as much as possible, the collecting of user data to be processed from the user service terminal through the user data collection policy described in step S35 may further include the contents described in steps S3511-S3516.
Step S3511, a user data category set is determined according to data acquisition indication information in the user data acquisition strategy, wherein the user data category set comprises n user data categories, each user data category is provided with m data category labels, n is an integer larger than 1, and m is an integer larger than 1. For example, the data collection instruction information is used to instruct which user data needs to be collected, the user data category may be understood as a primary label, and the data category label may be understood as a secondary label.
Step S3512, a hot data category set is generated according to the user data category set, wherein the hot data category set comprises n hot data categories, each hot data category is obtained after screening the user data categories, and each hot data category is provided with m hot data category labels. For example, the hot data category is used to characterize more popular data categories, i.e., those corresponding to user data that has potential value.
Step S3513, aiming at a target hot data category label, determining a category label selection rate according to the hot data category set, wherein the target hot data category label belongs to any one hot data category label in the m hot data category labels. As the name implies, the category label selection rate is used to characterize the probability that a category label is selected.
Step S3514, for the target hot data category label, if the target hot data category label meets the data heat evaluation condition, using the target hot data category label corresponding to the category label selection rate as a category label to be screened. For example, the data heat evaluation condition may be designed according to actual requirements, and is not limited herein.
Step S3515, repeating the step of determining the category labels to be screened until all m hot data category labels have been processed.
Step S3516, judging whether the number of the determined category labels to be screened exceeds a preset number; if the determined number of category labels to be screened does not exceed the preset number, acquiring the user data to be processed corresponding to the category labels to be screened from the user service terminal according to the category labels to be screened; if the determined number of category labels to be screened exceeds the preset number, sorting the determined category labels to be screened in descending order of category label selection rate, selecting the top preset number of category labels to be screened as category labels to be used, and acquiring the user data to be processed corresponding to the category labels to be used from the user service terminal according to the category labels to be used. For example, the preset number may be adjusted according to actual conditions, and is not limited herein.
It can be understood that according to the contents described in the above steps S3511-S3516, multiple levels of category labels can be determined, and the popularity data is taken into account, and then the category label selection rate is analyzed, so that the corresponding to-be-processed user data can be accurately collected from the user service terminal based on the selected to-be-used category label, and the to-be-processed user data is ensured to have potential mining and analyzing value, and the introduction of too much noise data is avoided as much as possible.
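The label screening of steps S3511-S3516 can be sketched as follows. The selection-rate formula (a label's share of its category's total heat) and the heat-evaluation threshold are illustrative assumptions; the patent leaves both to actual requirements.

```python
# Hypothetical sketch of steps S3513-S3516: compute each hot label's
# selection rate, keep labels passing the heat evaluation condition,
# and cap the result at the preset number by descending selection rate.

def select_category_labels(hot_categories, heat_threshold, preset_number):
    """hot_categories: {category: {label: heat_value}} (n categories of
    m labels each, per steps S3511-S3512)."""
    to_screen = []  # (label, selection_rate) pairs passing the evaluation
    for labels in hot_categories.values():
        total = sum(labels.values()) or 1.0
        for label, heat in labels.items():
            rate = heat / total                  # step S3513: selection rate
            if heat >= heat_threshold:           # step S3514: heat condition
                to_screen.append((label, rate))  # label to be screened
    # Step S3516: if too many labels survive, keep the highest rates.
    if len(to_screen) > preset_number:
        to_screen.sort(key=lambda pair: pair[1], reverse=True)
        to_screen = to_screen[:preset_number]
    return [label for label, _ in to_screen]
```

Only data in the returned categories would then be collected from the user service terminal, which is how the method avoids ingesting noise data in the first place.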
In some optional embodiments, the step S35 of performing user portrait analysis on the to-be-processed user data based on the user data analysis policy to obtain a user portrait analysis result may further include steps S3521-S3524.
Step S3521, according to the user portrait analysis index included in the user data analysis policy and the user data to be processed, obtaining the data feature identification degree of each user behavior data feature used for user portrait feature comparison and the global feature description confidence of the global feature description information corresponding to the portrait feature comparison items, wherein, for any user behavior data feature, the data feature identification degree of the user behavior data feature is the local feature description confidence of the local feature description information that can be matched with the user behavior data feature. For example, the user portrait analysis index is used to indicate the direction and emphasis of user portrait analysis. The user behavior data features may be represented as feature vectors or in other forms, which is not limited herein. The description of a feature can be understood as a visual description of the feature; the meanings of the related technical terms above and below can be reasonably deduced by those skilled in the art based on the contents provided in the present application in combination with existing patent documents or forums, and will not be further described herein.
Step S3522, according to the data feature identification degree and the global feature description confidence of each user behavior data feature, local feature description information is allocated to each user behavior data feature, wherein each user behavior data feature is allocated with partial local feature description information of the global feature description information, and an information set of the local feature description information allocated to each user behavior data feature includes the global feature description information.
Step S3523, generating a data feature matching indication corresponding to each user behavior data feature according to the local feature description information allocated to each user behavior data feature, where the data feature matching indication corresponding to the user behavior data feature indicates the local feature description information allocated to the user behavior data feature for any user behavior data feature.
Step S3524, respectively executing the data feature matching indications corresponding to the user behavior data features to respectively match the local feature description information allocated to the user behavior data features, and, on the basis of the user behavior data features, respectively comparing the portrait analysis description information of a reference portrait analysis result with the local feature description information matched with the user behavior data features, to obtain the user portrait analysis result of the user service terminal. For example, the reference portrait analysis result is configured in advance and can be flexibly configured according to actual requirements, which will not be further described herein.
Thus, through the above steps S3521-S3524, the user behavior data features can be analyzed both globally and locally, so that the differences and the relevance of the user portrait at the global level and the local level are taken into account, and the determined user portrait analysis result can reflect the user's actual portrait.
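A highly simplified sketch of steps S3521-S3524 follows. The way shares of the global description are distributed (larger shares to more identifiable features) and the final boolean comparison against the reference portrait are both simplifying assumptions made only to keep the illustration concrete.

```python
# Hypothetical sketch of steps S3521-S3524: split the global feature
# description into disjoint local descriptions per behavior feature
# (S3522-S3523), then compare each against a preconfigured reference
# portrait (S3524).

def analyze_portrait(features, global_description, reference_portrait):
    """features: {name: data feature identification degree};
    global_description: list of local description items;
    reference_portrait: {name: expected description item}."""
    # Steps S3522-S3523: allocate disjoint portions of the global
    # description, ordering features by identification degree so that
    # more identifiable features are served first (an assumption).
    order = sorted(features, key=features.get, reverse=True)
    assigned = {name: global_description[i::len(order)]  # disjoint slices
                for i, name in enumerate(order)}
    # Step S3524: execute each matching indication by checking whether
    # the reference item appears in the feature's local description.
    return {name: reference_portrait.get(name) in assigned[name]
            for name in order}
```

The disjoint-slice allocation mirrors the requirement in step S3522 that the local descriptions have no information intersection while their union covers the global description.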
Further, in step S3522, the assigning local feature description information to each of the user behavior data features according to the data feature recognition degree and the global feature description confidence of each of the user behavior data features may include the following steps S35221 and S35222.
Step S35221, obtaining the user click frequency of each user behavior data feature, wherein the user click frequency represents the click event correlation degree of the user behavior data feature.
Step S35222, based on the global feature description confidence, the user click frequency of each user behavior data feature, and the data feature identification degree of each user behavior data feature, local feature description information is allocated to each user behavior data feature, where for any user behavior data feature, the local feature identification degree of the local feature description information allocated to the user behavior data feature is positively correlated with the user click frequency of the user behavior data feature, and the local feature identification degree of the local feature description information allocated to the user behavior data feature is not greater than the data feature identification degree of the user behavior data feature.
On the basis of the step S35222, the assigning local feature description information to each user behavior data feature based on the global feature description confidence, the user click frequency of each user behavior data feature, and the data feature identification degree of each user behavior data feature may exemplarily include the following steps a to d.
Step a, calculating a page click frequency mean value of user click frequencies of all unallocated user behavior data features, calculating a ratio of the user click frequency of all unallocated user behavior data features to the page click frequency mean value, and respectively obtaining an effective click event percentage of click event correlation degrees of all unallocated user behavior data features, wherein the unallocated user behavior data features are user behavior data features which are not allocated with local feature description information.
And b, respectively obtaining the local feature recognition degrees to be distributed of the user behavior data features according to the effective click event percentage of the click event correlation degree of the user behavior data features which are not distributed and the global feature description confidence degree, wherein the local feature recognition degrees to be distributed of the user behavior data features which are not distributed are positively correlated with the effective click event percentage of the click event correlation degree of the user behavior data features which are not distributed aiming at any user behavior data features which are not distributed.
And c, if the local feature identification degree to be distributed of each unallocated user behavior data feature is not greater than the data feature identification degree of the user behavior data feature, selecting local feature description information with the local feature identification degree to be distributed of the unallocated user behavior data feature from the unallocated local feature description information of the global feature description information aiming at any unallocated user behavior data feature, and distributing the local feature description information to the unallocated user behavior data feature, wherein the local feature description information distributed by each user behavior data feature does not have information intersection.
Step d: if target user behavior data features exist, then for any target user behavior data feature, select, from the unallocated local feature description information of the global feature description information, local feature description information matching the data feature recognition degree of the target user behavior data feature, and allocate it to that feature; update the global feature description confidence to the local feature recognition degree of the local feature description information that remains unallocated in the current global feature description information; and return to the step of calculating the mean page click frequency of the user click frequencies of the unallocated user behavior data features and continue execution. A target user behavior data feature is a user behavior data feature to which no local feature description information has been allocated and whose local feature recognition degree to be allocated is greater than its own data feature recognition degree.
With this design, based on steps a to d, when local feature description information is allocated to each user behavior data feature, the mean page click frequency of the user click frequencies and the effective click event percentage of the click event correlation degree of each unallocated user behavior data feature are fully considered. Since the user click frequency, the click event correlation degree and the corresponding effective click event percentage reflect, at a numerical level, the correlation among the user behavior data features and among the corresponding pieces of local feature description information, the allocated local feature description information contains fewer errors and omissions.
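For illustration only, the control flow of steps a to d can be sketched as follows. The concrete scoring and matching rules used here (the product form of step b and the nearest-degree matching in steps c and d) are hypothetical assumptions; the description above fixes only the loop structure, the positive correlation, and the confidence update.

```python
# Illustrative sketch of the allocation loop in steps a-d.
# The data model and the matching rule are hypothetical; the text fixes
# only the control flow, not the concrete formulas.

def allocate_descriptions(features, descriptions, global_confidence):
    """features: list of dicts with 'click_freq' and 'recognition_degree';
    descriptions: list of dicts with 'recognition_degree' and 'info';
    returns a mapping from feature index to allocated description."""
    allocation = {}
    pool = list(descriptions)  # unallocated local feature description info
    while len(allocation) < len(features) and pool:
        pending = [i for i in range(len(features)) if i not in allocation]
        # Step a: mean page click frequency and effective click event percentage
        mean_freq = sum(features[i]['click_freq'] for i in pending) / len(pending)
        percentage = {i: features[i]['click_freq'] / mean_freq for i in pending}
        # Step b: degree to be allocated, positively correlated with the
        # percentage and scaled by the global confidence (assumed form)
        to_allocate = {i: percentage[i] * global_confidence for i in pending}
        # Step d precondition: targets whose degree to be allocated exceeds
        # their own data feature recognition degree
        targets = [i for i in pending
                   if to_allocate[i] > features[i]['recognition_degree']]
        if not targets:
            # Step c: allocate by the degree to be allocated (nearest match)
            for i in pending:
                if not pool:
                    break
                best = min(pool, key=lambda d: abs(d['recognition_degree']
                                                   - to_allocate[i]))
                allocation[i] = best
                pool.remove(best)
            break
        # Step d: allocate targets by their data feature recognition degree
        for i in targets:
            if not pool:
                break
            best = min(pool, key=lambda d: abs(d['recognition_degree']
                                               - features[i]['recognition_degree']))
            allocation[i] = best
            pool.remove(best)
        # Update the global confidence from the remaining pool (assumed: max)
        if pool:
            global_confidence = max(d['recognition_degree'] for d in pool)
    return allocation
```

Because each description is removed from the pool once allocated, the allocated pieces of local feature description information have no intersection, matching the constraint in step c.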
In further embodiments, in addition to steps a to d above, the following implementation may optionally be adopted: each user behavior data feature corresponds to one data feature group, and the user behavior data features in the same data feature group share the same user click frequency and data feature recognition degree. On this basis, in step S35222, the allocating of local feature description information to each user behavior data feature based on the global feature description confidence, the user click frequency of each user behavior data feature and the data feature recognition degree of each user behavior data feature may include the following steps S11 to S15.
Step S11: calculate the mean page click frequency of the user click frequencies of the unallocated user behavior data features, where an unallocated user behavior data feature is a user behavior data feature to which no local feature description information has been allocated.
Step S12: for any unallocated data feature group, calculate the ratio of the user click frequency of a single user behavior data feature in the group to the mean page click frequency, thereby obtaining the effective click event percentage of the click event correlation degree of that user behavior data feature. An unallocated data feature group is a data feature group containing a user behavior data feature to which no local feature description information has been allocated.
Step S13: for any unallocated data feature group, obtain the local feature recognition degree to be allocated of each user behavior data feature in the group according to the effective click event percentage of its click event correlation degree and the global feature description confidence. For any user behavior data feature, its local feature recognition degree to be allocated is positively correlated with the effective click event percentage of its click event correlation degree.
Step S14: if the local feature recognition degree to be allocated corresponding to every unallocated data feature group is not greater than the data feature recognition degree corresponding to that group, then for any user behavior data feature to which no local feature description information has been allocated, select, from the unallocated local feature description information of the global feature description information, local feature description information matching the feature's local feature recognition degree to be allocated, and allocate it to that feature. The local feature description information allocated to different user behavior data features has no information intersection.
Step S15: if a target data feature group exists, then for any target data feature group, select, from the unallocated local feature description information of the global feature description information, local feature description information matching the data feature recognition degree corresponding to the target data feature group for each user behavior data feature in the group, and allocate it to each such feature; update the global feature description confidence to the local feature recognition degree of the local feature description information that remains unallocated in the current global feature description information; and return to step S11 to continue execution. A target data feature group is a data feature group, containing user behavior data features without allocated local feature description information, whose corresponding local feature recognition degree to be allocated is greater than the data feature recognition degree corresponding to the group.
It can be understood that steps S11 to S15 are similar to steps a to d; either of the two implementations may therefore be selected, which is not limited herein.
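Under the assumption stated above that the user behavior data features in one data feature group share the same user click frequency and data feature recognition degree, the grouping itself can be sketched as follows; the dictionary-based data model is a hypothetical illustration, not part of the described method.

```python
# Illustrative grouping of user behavior data features into data feature
# groups keyed by (user click frequency, data feature recognition degree).
from collections import defaultdict

def group_features(features):
    """features: list of dicts with 'click_freq' and 'recognition_degree';
    returns {(click_freq, recognition_degree): [feature indices]}."""
    groups = defaultdict(list)
    for idx, feat in enumerate(features):
        groups[(feat['click_freq'], feat['recognition_degree'])].append(idx)
    return dict(groups)
```

Because every feature in a group then shares one click frequency, the per-group ratio of step S12 needs to be computed only once per group rather than once per feature.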
In some other alternative embodiments, the comparing in step S3524, based on each user behavior data feature, of the portrait analysis description information of the reference portrait analysis result with the local feature description information of each user behavior data feature to obtain the user portrait analysis result of the user service terminal may include: comparing, in parallel and on the basis of each user behavior data feature, the portrait analysis description information of the reference portrait analysis result with the local feature description information matching that user behavior data feature; when a target analysis result whose comparison timeliness weight is greater than a preset timeliness weight threshold is obtained for any user behavior data feature, terminating the comparison of the portrait analysis description information of the reference portrait analysis result against the remaining user behavior data features; and determining the user portrait analysis result from the target analysis result whose comparison timeliness weight is greater than the preset timeliness weight threshold. In this way, when determining the user portrait analysis result, the influence of the comparison timeliness weight on the user portrait is fully considered, so that the user portrait analysis result reflects the actual situation of the user in a timely manner, timely portrait information guidance is provided for the service provider, and the service provider can promptly push related service products.
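For illustration only, the parallel comparison with early termination described above can be sketched as follows. The comparison function and the form of the timeliness weight are hypothetical placeholders; the description fixes only the parallelism and the threshold-based early stop.

```python
# Illustrative sketch: compare reference portrait description info against
# each feature's local info in parallel, stopping at the first result whose
# timeliness weight exceeds the preset threshold.
from concurrent.futures import ThreadPoolExecutor, as_completed

def parallel_compare(reference_info, feature_infos, compare_fn, threshold):
    """compare_fn(reference_info, info) -> dict with 'timeliness_weight'.
    Returns the first target analysis result over the threshold, else None."""
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(compare_fn, reference_info, info)
                   for info in feature_infos]
        for future in as_completed(futures):
            result = future.result()
            if result['timeliness_weight'] > threshold:
                # Early termination: abandon comparisons not yet started
                for f in futures:
                    f.cancel()
                return result
    return None  # no comparison exceeded the threshold
```

Note that `Future.cancel` only prevents not-yet-started comparisons from running; comparisons already in flight complete harmlessly before the executor shuts down.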
In some other alternative embodiments, after the obtaining, in step S3521, of the data feature recognition degree of each user behavior data feature used for user portrait feature comparison and the global feature description confidence of the global feature description information corresponding to the user portrait feature comparison, the method further includes: calculating a confidence analysis result of the data feature recognition degrees of the user behavior data features to obtain a first recognition confidence; and, if the first recognition confidence is smaller than the global feature description confidence, deleting part of the local feature description information from the global feature description information, so that the global feature description confidence of the pruned global feature description information is not greater than the first recognition confidence.
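A minimal sketch of this pruning step follows. It assumes, purely for illustration, that the confidence analysis is a mean over the recognition degrees and that the global feature description confidence equals the maximum local recognition degree present; neither formula is prescribed by the description.

```python
# Illustrative sketch of confidence-based pruning of the global feature
# description information. The mean and max formulas are assumptions.

def prune_global_info(local_infos, recognition_degrees):
    """local_infos: list of (local_recognition_degree, payload) tuples;
    recognition_degrees: data feature recognition degrees of the features.
    Returns the (possibly pruned) local feature description information."""
    # First recognition confidence: assumed to be the mean of the degrees
    first_confidence = sum(recognition_degrees) / len(recognition_degrees)
    # Global confidence: assumed to be the max local recognition degree
    global_confidence = max(deg for deg, _ in local_infos)
    if first_confidence < global_confidence:
        # Delete local info so the residual global confidence does not
        # exceed the first recognition confidence
        local_infos = [(deg, payload) for deg, payload in local_infos
                       if deg <= first_confidence]
    return local_infos
```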
In some other alternative embodiments, the service processing thread of the user service terminal is preconfigured with an analysis instruction reporting sub-thread. On this basis, the acquiring of a service data analysis instruction for the user service terminal in step S31 includes: acquiring the service data analysis instruction reported by the analysis instruction reporting sub-thread. After the service data analysis instruction for the user service terminal is acquired in step S31, the method further includes: detecting the thread running labels of the service processing thread based on the service data analysis instruction reported by the analysis instruction reporting sub-thread; and, when a change in the number of thread running labels of the service processing thread is detected, determining that the user service terminal is in a service data interaction state.
In some other alternative embodiments, the service processing thread of the user service terminal is preconfigured with an interactive object identification sub-thread. On this basis, the acquiring of a service data analysis instruction for the user service terminal in step S31 includes: acquiring the current service interactive object of the service processing thread collected by the interactive object identification sub-thread. After the service data analysis instruction for the user service terminal is acquired in step S31, the method further includes: obtaining an interactive object identifier set based on the current service interactive object of the service processing thread collected by the interactive object identification sub-thread; and, when an update record of the interactive object identifier set appears in the thread running record corresponding to the service processing thread, determining that the user service terminal is in a service data interaction state.
In some other alternative embodiments, the service processing thread of the user service terminal is preconfigured with an interactive object identification sub-thread. On this basis, the acquiring of a service data analysis instruction for the user service terminal in step S31 includes: acquiring the current service interactive object of the service processing thread of the user service terminal collected by the interactive object identification sub-thread. After the service data analysis instruction for the user service terminal is acquired in step S31, the method further includes: detecting an interaction state identifier in the current service interactive object; and determining, based on the detection result, whether the user service terminal is in a service data interaction state.
It can be understood that any one of the above manners of determining the service data interaction state may be used, which is not limited herein.
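For illustration only, the first of the three checks above (a change in the number of thread running labels) can be sketched as follows; how the labels are actually read is a hypothetical assumption, modeled here as a callable.

```python
# Illustrative sketch of the running-label-count check for the service
# data interaction state. read_labels is a hypothetical callable standing
# in for however the thread running labels are obtained.

def is_interacting(read_labels, previous_count):
    """read_labels() returns the current list of thread running labels of
    the service processing thread. A changed count indicates that the
    user service terminal is in a service data interaction state."""
    current_count = len(read_labels())
    return current_count != previous_count, current_count
```

The returned current count would be carried forward as `previous_count` for the next detection, so that each check compares against the most recent observation.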
Fig. 4 is a block diagram illustrating an exemplary big data-based user data processing apparatus 140 according to some embodiments of the present application, where the big data-based user data processing apparatus 140 may include the following functional modules.
The analysis instruction obtaining module 141 is configured to obtain a service data analysis instruction for the user service terminal.
An analysis policy determining module 142, configured to determine a user data analysis policy based on the service data analysis instruction when it is determined that the user service terminal is in a service data interaction state based on the service data analysis instruction.
The object data obtaining module 143 is configured to obtain a service interaction object and service interaction type data related to a service interaction period, and obtain target service interaction object data based on the service interaction object related to the service interaction period.
A collection policy determining module 144, configured to determine a user data collection policy based on analysis policy indication information of the user data analysis policy, the service interaction type data, and the target service interaction object data.
The user data processing module 145 is configured to collect user data to be processed from the user service terminal through the user data collection policy, and perform user portrait analysis on the user data to be processed based on the user data analysis policy to obtain a user portrait analysis result.
For the above descriptions of the analysis instruction obtaining module 141, the analysis policy determining module 142, the object data obtaining module 143, the collection policy determining module 144 and the user data processing module 145, reference may be made to the descriptions of the corresponding method embodiments.
Based on the same or similar inventive concept, a system embodiment is also provided.
A big data-based user data processing system includes a big data server, a user service terminal and a service provider platform, the user service terminal and the service provider platform being in communication with the big data server;
the service provider platform is configured to: send a service data analysis instruction for the user service terminal to the big data server;
the big data server is used for: acquiring a service data analysis instruction aiming at a user service terminal; when the user service terminal is determined to be in a service data interaction state based on the service data analysis instruction, determining a user data analysis strategy based on the service data analysis instruction; acquiring a service interaction object and service interaction type data related to a service interaction period, and acquiring target service interaction object data based on the service interaction object related to the service interaction period; determining a user data acquisition strategy based on analysis strategy indication information of the user data analysis strategy, the service interaction type data and the target service interaction object data; acquiring user data to be processed from the user service terminal through the user data acquisition strategy, and performing user portrait analysis on the user data to be processed based on the user data analysis strategy to obtain a user portrait analysis result; and feeding back the user portrait analysis result to the service provider platform.
For the above description of the system embodiments, reference may be made to the descriptions of the corresponding method embodiments.
It should be understood that, for technical terms not explicitly defined above, a person skilled in the art can unambiguously determine their meaning from the above disclosure. For example, for terms such as values, coefficients, weights, indexes, factors and confidences, a person skilled in the art can derive their meaning from the logical relationships of the context, and the value ranges of these quantities can be selected according to the actual situation, for example 0 to 1, 1 to 10, or 50 to 100, which is not limited herein.
A person skilled in the art can likewise unambiguously determine from the above disclosure certain preset, reference, predetermined, set and target technical features/terms, such as thresholds, threshold intervals and threshold ranges. For technical feature terms that are not explained, a person skilled in the art can clearly and completely implement the technical solution by reasonable and unambiguous derivation based on the logical relationships of the preceding and following paragraphs. Prefixes of unexplained technical feature terms, such as "first", "second", "previous", "next", "current", "historical", "latest", "best", "target", "specified" and "real-time", can be unambiguously derived from the context, as can suffixes such as "list", "feature", "sequence", "set", "matrix", "unit", "element" and "track".
The above embodiments of the present application will be apparent to those skilled in the art from the foregoing disclosure. It should be understood that the process by which a person skilled in the art derives and analyzes unexplained technical terms is based on the contents described in the present application; the foregoing therefore does not constitute an inventive judgment of the overall scheme.
Having thus described the basic concept, it will be apparent to those skilled in the art that the foregoing detailed disclosure is to be considered merely illustrative and not restrictive of the present application. Various modifications, improvements and adaptations to the present application may occur to those skilled in the art, although not explicitly described herein. Such modifications, improvements and adaptations are suggested in the present application and thus fall within the spirit and scope of the exemplary embodiments of the present application.
Also, this application uses specific terminology to describe embodiments of the application. Reference throughout this specification to "one embodiment," "an embodiment," and/or "some embodiments" means that a particular feature, structure, or characteristic described in connection with at least one embodiment of the present application is included in at least one embodiment of the present application. Therefore, it is emphasized and should be appreciated that two or more references to "an embodiment" or "one embodiment" or "an alternative embodiment" in various portions of this specification are not necessarily all referring to the same embodiment. Furthermore, some features, structures, or characteristics of at least one embodiment of the present application may be combined as appropriate.
In addition, those skilled in the art will recognize that the various aspects of the application may be illustrated and described in terms of several patentable species or contexts, including any new and useful combination of procedures, machines, articles, or materials, or any new and useful modifications thereof. Accordingly, aspects of the present application may be embodied entirely in hardware, entirely in software (including firmware, resident software, micro-code, etc.) or in a combination of hardware and software. The above hardware or software may be referred to as a "unit", "component", or "system". Furthermore, aspects of the present application may be represented as a computer product, including computer readable program code, embodied in at least one computer readable medium.
A computer readable signal medium may comprise a propagated data signal with computer program code embodied therein, for example, on a baseband or as part of a carrier wave. The propagated signal may take any of a variety of forms, including electromagnetic, optical, and the like, or any suitable combination. A computer readable signal medium may be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code on a computer readable signal medium may be propagated over any suitable medium, including radio, electrical cable, fiber optic cable, RF, or the like, or any combination of the preceding.
Computer program code required for the execution of aspects of the present application may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java, Scala, Smalltalk, Eiffel, JADE, Emerald, C++, C#, VB.NET and Python; conventional procedural languages such as the "C" programming language, Visual Basic, Fortran 2003, Perl, COBOL 2002, PHP and ABAP; dynamic programming languages such as Python, Ruby and Groovy; or other programming languages. The program code may execute entirely on the user's computer, partly on the user's computer as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any form of network, such as a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet), in a cloud computing environment, or as a service, such as software as a service (SaaS).
Additionally, the order of the process elements and sequences described herein, and the use of numbers, letters or other designations, are not intended to limit the order of the processes and methods unless otherwise indicated in the claims. While various presently contemplated embodiments have been discussed in the foregoing disclosure by way of example, it should be understood that such detail is solely for that purpose, and that the appended claims are not limited to the disclosed embodiments but, on the contrary, are intended to cover all modifications and equivalent arrangements that are within the spirit and scope of the embodiments herein. For example, although the system components described above may be implemented by hardware means, they may also be implemented by software-only solutions, such as installing the described system on an existing server or mobile device.
It should also be appreciated that in the foregoing description of embodiments of the present application, various features are sometimes grouped together in a single embodiment, figure or description thereof for the purpose of streamlining the disclosure and aiding the understanding of at least one embodiment. This method of disclosure, however, is not to be interpreted as reflecting an intention that the claimed subject matter requires more features than are expressly recited in the claims. Indeed, an embodiment may be characterized by less than all of the features of a single embodiment disclosed above.