CN111274462A - Data processing method and device - Google Patents
- Publication number
- CN111274462A (CN202010049536.8A)
- Authority
- CN
- China
- Prior art keywords
- behavior information
- behavior
- target object
- time periods
- vectors
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/906—Clustering; Classification
Landscapes
- Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a data processing method and device. The method comprises: acquiring behavior information of a target object, and forming a plurality of behavior information strings based on the execution time of the behavior information; vectorizing the behavior information strings to obtain behavior vectors of the target object in a plurality of different time periods; and determining the habitual behavior information of the target object in the plurality of different time periods by clustering those behavior vectors. The invention addresses the technical problem in the prior art that a user's behavior habits are difficult to analyze accurately when the data volume is small.
Description
Technical Field
The present invention relates to the field of data processing, and in particular, to a method and an apparatus for processing data.
Background
People are becoming increasingly aware of the importance of data. The big data era poses new challenges for human data handling capability, and also offers unprecedented room and potential for obtaining deeper and more comprehensive insights. For example, the behavior habits of a user can be analyzed through big data analysis. At present this analysis is usually based on the frequency of each behavior of the user, but behavior habits derived from frequency alone are not accurate for a single user or for a group of users with little data.
No effective solution has yet been proposed for the prior-art problem that a user's behavior habits are difficult to analyze accurately when the data volume is small.
Disclosure of Invention
The embodiment of the invention provides a data processing method and device, and at least solves the technical problem that the behavior habit of a user is difficult to accurately analyze under the condition of small data volume in the prior art.
According to an aspect of an embodiment of the present invention, there is provided a data processing method, including: acquiring behavior information of a target object, and forming a plurality of behavior information strings based on the execution time of the behavior information; vectorizing the behavior information string to obtain behavior vectors of the target object in a plurality of different time periods; and determining the habitual behavior information of the target object in a plurality of different time periods by clustering the behavior vectors of the target object in the plurality of different time periods.
Further, acquiring behavior information of the target object, and constructing a plurality of behavior information strings based on the execution time of the behavior information, including: acquiring behavior information of a target object in a preset time range, wherein the preset time range comprises a plurality of time periods; connecting the behavior information belonging to the same time period according to the execution time of the behavior information to obtain a plurality of connection results; and segmenting the connection result to obtain the behavior information string in each time period.
Further, segmenting the connection result to obtain a behavior information string in each time period, including: acquiring a time difference between two adjacent behavior information; and if the time difference is greater than the preset time length, segmenting the two adjacent behavior information.
Further, before vectorization processing is carried out on the behavior information string to obtain behavior vectors of the target object in a plurality of different time periods, a time period to which the execution time of the initial behavior information in the behavior information string belongs is determined as a time period to which the behavior information string belongs; and performing alignment completion processing on the behavior information strings in each time period to ensure that the behavior information strings in the same time period have the same dimensionality.
Further, vectorizing the behavior information string to obtain behavior vectors of the target object in a plurality of different time periods, including: acquiring behavior data corresponding to each behavior information in a behavior information string in each time period, wherein the behavior data comprises at least one of the following items: identification of the behavior information, interaction data and time interval with the next behavior information; and replacing the behavior information with the behavior data to form a behavior vector of the target object in a plurality of different time periods.
Further, determining the habitual behavior information of the target object in a plurality of different time periods by clustering the behavior vectors of the target object in the plurality of different time periods, including: clustering the behavior vectors in the same time period to obtain a clustering center corresponding to each time period; selecting a plurality of candidate vectors with the distance from the clustering center smaller than a preset distance; obtaining the mean value of the selected candidate vectors to obtain a mean value vector; and converting the behavior data represented by each item in the mean vector into behavior information to obtain the habitual behavior information of the target object in a plurality of different time periods.
Further, after acquiring the behavior information of the target object within the preset time range, the method further includes: and eliminating the behavior information with the occurrence frequency lower than the preset frequency.
Further, the behavior information includes voice interaction behavior information between the target object and the home appliance.
According to an aspect of an embodiment of the present invention, there is provided a data processing apparatus including: the acquisition module is used for acquiring the behavior information of the target object and forming a plurality of behavior information strings based on the execution time of the behavior information; the processing module is used for vectorizing the behavior information string to obtain behavior vectors of the target object in a plurality of different time periods; the determining module is used for determining the habitual behavior information of the target object in a plurality of different time periods by clustering the behavior vectors of the target object in the plurality of different time periods.
According to an aspect of the embodiments of the present invention, there is provided a storage medium including a stored program, wherein, when the program runs, a device on which the storage medium is located is controlled to execute the above-mentioned data processing method.
According to an aspect of the embodiments of the present invention, there is provided a processor, configured to execute a program, where the program executes the method for processing data described above.
In the embodiment of the invention, behavior information of a target object is obtained, and a plurality of behavior information strings are formed based on the execution time of the behavior information; vectorizing the behavior information string to obtain behavior vectors of the target object in a plurality of different time periods; and determining the habitual behavior information of the target object in a plurality of different time periods by clustering the behavior vectors of the target object in the plurality of different time periods. According to the scheme, under the condition that the data volume is limited, vectorization processing is carried out on the behavior information of different time periods, and clustering analysis is carried out on the behavior vectors of different time periods obtained through the vectorization processing, so that behavior habits of users in different time periods are obtained, and the problem that the behavior habits of the users are difficult to accurately analyze under the condition that the data volume is small in the prior art is solved.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the invention and together with the description serve to explain the invention without limiting the invention. In the drawings:
FIG. 1 is a flow chart of a method of processing data according to an embodiment of the invention;
FIG. 2 is a diagram illustrating an alignment completion of behavior information strings according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a first vectorization process for behavior information strings according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of determining a behavior information string according to an embodiment of the invention;
FIG. 5 is a schematic diagram of obtaining information about user behavior according to an embodiment of the present invention; and
FIG. 6 is a schematic diagram of a data processing apparatus according to an embodiment of the present invention.
Detailed Description
In order to make the technical solutions of the present invention better understood, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
Example 1
In accordance with an embodiment of the present invention, there is provided an embodiment of a method for processing data, it being noted that the steps illustrated in the flowchart of the figure may be performed in a computer system such as a set of computer-executable instructions and that, although a logical order is illustrated in the flowchart, in some cases, the steps illustrated or described may be performed in an order different than that presented herein.
Fig. 1 is a flowchart of a data processing method according to an embodiment of the present invention, as shown in fig. 1, the method including the steps of:
Step S102, acquiring behavior information of the target object, and forming a plurality of behavior information strings based on the execution time of the behavior information.
Specifically, the target object is the user whose behavior needs to be analyzed. It may be a single user, or multiple users who jointly operate the same home appliance, for example, the members of one household. The behavior information is an interactive behavior between the user and the home appliance, for example, a control instruction sent to the home appliance by the user, which may be a voice interaction, a gesture interaction, a remote control operation, a button press, and the like. The behavior information is connected in order of execution time to obtain the behavior information string.
In an optional embodiment, taking an air conditioner as an example, behavior information of a user operating the air conditioner is collected, and the behavior information of each day is connected according to the execution time sequence, so that a behavior information string corresponding to each day can be formed.
Step S104, vectorizing the behavior information string to obtain behavior vectors of the target object in a plurality of different time periods.
In an alternative implementation, one day may be taken as a cycle, and one day may be divided into 24 time periods, with each hour being taken as a time period.
First, each behavior information string is mapped to a time period; for example, the time period to which the execution time of the first behavior information in the string belongs is taken as the time period of the whole string. After the behavior information strings contained in each time period are determined, the strings in each time period are vectorized to obtain the behavior vectors for that time period.
When vectorizing the behavior information string, the name of the behavior information may be converted into a word vector in a word2vec manner or the like, so as to obtain a behavior vector corresponding to the behavior information string, or the behavior information in the behavior information string may be represented by a corresponding preset numerical value, so as to form a behavior vector corresponding to the information string.
Step S106, determining the habitual behavior information of the target object in a plurality of different time periods by clustering the behavior vectors of the target object in a plurality of different time periods.
After the behavior vectors corresponding to the different time periods are determined, they are clustered to obtain the habit behavior vector of the user in each time period; each habit behavior vector is then converted back into the habit behavior information of the user in that time period.
After the habit behavior information of the user in each time period is obtained, the household appliance can be intelligently controlled based on it. In an alternative embodiment, the habit behavior information of the user from 13:00 to 14:00 is: power on, set the temperature to 23 ℃, and sweep the wind up and down. The appliance can then automatically execute this action string at 13:00, or prompt the user to confirm whether to execute it, thereby achieving intelligent control according to the user's habits.
As can be seen from the above, in the embodiments of the present application, behavior information of a target object is obtained, and a plurality of behavior information strings are formed based on the execution time of the behavior information; vectorizing the behavior information string to obtain behavior vectors of the target object in a plurality of different time periods; and determining the habitual behavior information of the target object in a plurality of different time periods by clustering the behavior vectors of the target object in the plurality of different time periods. According to the scheme, under the condition that the data volume is limited, vectorization processing is carried out on the behavior information of different time periods, and clustering analysis is carried out on the behavior vectors of different time periods obtained through the vectorization processing, so that behavior habits of users in different time periods are obtained, and the problem that the behavior habits of the users are difficult to accurately analyze under the condition that the data volume is small in the prior art is solved.
As an alternative embodiment, acquiring behavior information of a target object, and forming a plurality of behavior information strings based on execution time of the behavior information includes: acquiring behavior information of a target object in a preset time range, wherein the preset time range comprises a plurality of time periods; connecting the behavior information belonging to the same time period according to the execution time of the behavior information to obtain a plurality of connection results; and segmenting the connection result to obtain the behavior information string in each time period.
Specifically, the preset time range may be one month, two months or the like closest to the current time, and the time period may be one day.
In an optional embodiment, taking an air conditioner as an example, all behavior information of interaction with the air conditioner within the month closest to the current time may be obtained. The behavior information within each day is connected to obtain one connection result per day, and the connection result of each day is then divided.
As an alternative embodiment, segmenting the connection result to obtain the behavior information string in each time period includes: acquiring a time difference between two adjacent behavior information; and if the time difference is greater than the preset time length, segmenting the two adjacent behavior information.
If the time difference between two adjacent pieces of behavior information is greater than the preset time length, the two probably do not belong to the same group of operations, so the connection result is divided between them.
In an optional embodiment, the preset time length may be 30 minutes, a day is used as a time period, and the connection result of the behavior information in a day is divided according to an interval exceeding 30 minutes, so as to obtain a plurality of behavior information strings.
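The segmentation rule above can be sketched as follows. This is a minimal illustration, assuming each piece of behavior information is stored as a `(timestamp, action)` pair; the function name `split_by_gap` and the action names are hypothetical and do not come from the source.

```python
from datetime import datetime, timedelta

def split_by_gap(events, max_gap=timedelta(minutes=30)):
    """Split one day's (timestamp, action) pairs into behavior information strings.

    A new string starts whenever the time difference between two adjacent
    actions is greater than max_gap (30 minutes in the example embodiment).
    """
    strings = []
    current = []
    previous_time = None
    # Connect the behavior information in order of execution time first.
    for timestamp, action in sorted(events, key=lambda e: e[0]):
        if previous_time is not None and timestamp - previous_time > max_gap:
            strings.append(current)
            current = []
        current.append((timestamp, action))
        previous_time = timestamp
    if current:
        strings.append(current)
    return strings

# Hypothetical sample data for one day.
day = [
    (datetime(2020, 1, 1, 13, 0), "power_on"),
    (datetime(2020, 1, 1, 13, 2), "set_temperature_23"),
    (datetime(2020, 1, 1, 13, 5), "swing_up_down"),
    (datetime(2020, 1, 1, 20, 30), "power_off"),
]
segments = split_by_gap(day)
```

Here the first three actions fall within 30 minutes of one another and form one string, while the evening power-off forms a second string.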
As an optional embodiment, before performing vectorization processing on the behavior information string to obtain behavior vectors of the target object in a plurality of different time periods, the method further includes: determining the time period to which the execution time of the initial behavior information in the behavior information string belongs as the time period to which the behavior information string belongs; and performing alignment completion processing on the behavior information strings in each time period to ensure that the behavior information strings in the same time period have the same dimensionality.
Specifically, alignment completion gives the behavior information strings in the same time period the same dimension. Having the same dimension means that, at each position, every behavior information string in the time period holds either the same behavior information or a null action.
In an alternative embodiment, still taking one day as the time cycle, the day is divided evenly into 24 time periods, and the time period to which the execution time of the first behavior information in the behavior information string belongs is determined. For example, for the behavior information string "power on - set to 23 degrees Celsius - sweep wind up and down", the power-on action occurs between 13:00 and 14:00, so the behavior information string is assigned to the time period 13:00 to 14:00.
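As a rough sketch of this assignment, assuming hourly time periods and behavior information strings stored as lists of `(timestamp, action)` pairs, each string can be keyed by the hour of its first action. The names `period_of` and `group_by_period` are hypothetical.

```python
from collections import defaultdict
from datetime import datetime

def period_of(behavior_string):
    """Return the hour (0-23) in which the first action of the string was executed."""
    first_timestamp, _ = behavior_string[0]
    return first_timestamp.hour

def group_by_period(behavior_strings):
    """Group behavior information strings by the time period of their first action."""
    groups = defaultdict(list)
    for behavior_string in behavior_strings:
        groups[period_of(behavior_string)].append(behavior_string)
    return dict(groups)

# Hypothetical example: the string starts at 13:58 and runs past 14:00,
# but it is still assigned to the 13:00-14:00 period.
example = [
    (datetime(2020, 1, 1, 13, 58), "power_on"),
    (datetime(2020, 1, 1, 14, 3), "set_temperature_23"),
]
groups = group_by_period([example])
```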
Because the behavior information contained in the behavior information strings in the same time period may differ, alignment completion needs to be performed on the behavior information strings before vectorization, so that the behavior information strings in the same time period have the same dimension.
FIG. 2 is a schematic diagram of performing alignment completion on behavior information strings according to an embodiment of the present invention. With reference to FIG. 2, behavior information string 1 is: action one, action three, action four; behavior information string 2 is: action one, action three, action five. Positions where the two strings contain the same action are kept aligned, and positions where they differ are supplemented with null actions, giving the result shown in FIG. 2. After alignment completion, behavior information string 1 is: action one, action three, action four, null action; behavior information string 2 is: action one, action three, null action, action five.
FIG. 2 shows alignment completion for only two behavior information strings, whereas a time period often contains more than two. In that case, all behavior information strings in the same time period may be aligned and completed simultaneously, or the longest behavior information string may be found and each of the other behavior information strings aligned and completed against it.
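One possible way to reproduce the two-string result of FIG. 2 is sketched below. The source does not say which alignment algorithm is used; this sketch borrows Python's standard `difflib.SequenceMatcher` as a stand-in, and the `NULL` marker and `align_pair` name are hypothetical. Extending it to more than two strings is not covered here.

```python
from difflib import SequenceMatcher

NULL = "null action"  # hypothetical marker for the padding action

def align_pair(a, b):
    """Align two action lists so that equal actions share a position.

    Positions where the lists differ are padded with NULL on the other side,
    so both results have the same length.
    """
    aligned_a, aligned_b = [], []
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            aligned_a.extend(a[i1:i2])
            aligned_b.extend(b[j1:j2])
        else:
            # 'replace', 'delete' or 'insert': keep each side's own actions
            # and pad the opposite side with null actions.
            aligned_a.extend(a[i1:i2] + [NULL] * (j2 - j1))
            aligned_b.extend([NULL] * (i2 - i1) + b[j1:j2])
    return aligned_a, aligned_b

string_1 = ["action one", "action three", "action four"]
string_2 = ["action one", "action three", "action five"]
aligned_1, aligned_2 = align_pair(string_1, string_2)
```

With the inputs from FIG. 2, `aligned_1` ends in a null action and `aligned_2` has a null action before action five, so both strings have four positions.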
As an optional embodiment, performing vectorization processing on the behavior information string to obtain behavior vectors of the target object in a plurality of different time periods includes: acquiring behavior data corresponding to each behavior information in a behavior information string in each time period, wherein the behavior data comprises at least one of the following items: identification of the behavior information, interaction data and time interval with the next behavior information; and replacing the behavior information with the behavior data to form a behavior vector of the target object in a plurality of different time periods.
Specifically, the behavior data corresponding to the behavior information may be one or more, and the behavior information is replaced with the corresponding behavior data, so that the behavior vector corresponding to the behavior information string is obtained.
The identifier of the behavior information may be a preset action number, for example, 01 for a power-on operation, 02 for a temperature adjustment operation, 00 for a power-off operation, and so on. The interaction data indicates the duration of the action or a specific interaction value of the action. For example, if the action lasts for 1 hour, the interaction data is 1 hour; if the temperature is adjusted to 23 degrees Celsius, the interaction data is 23.
In an alternative embodiment, in the case that the behavior data includes the above three parameters, the behavior information string 1 after completion of alignment in fig. 2 is vectorized to obtain [ action one number, interaction data, time interval, action three number, interaction data, time interval, action four number, interaction data, time interval, null action number, interaction data ]. The interaction data of the null action is 0, and the time interval between the null action and the other actions is also 0.
FIG. 3 is a schematic diagram of a first vectorization processing on a behavior information string according to an embodiment of the present invention. With reference to FIG. 3, each piece of behavior information (i.e., each action above) in the behavior information string may be represented by three values: the identifier of the behavior information, the interaction data, and the time interval, where the interaction data may be an action duration or an interaction value.
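The three-value encoding can be sketched as below. The action numbering is hypothetical: the source gives 00, 01 and 02 as example numbers but does not state the number of the null action, so this sketch reserves 0 for the null action and starts real actions at 1. It also assumes the time interval is measured in seconds and that every action, including the last, contributes three values.

```python
# Hypothetical action numbering; None stands for the null action.
ACTION_IDS = {None: 0, "power_on": 1, "set_temperature": 2, "swing": 3, "power_off": 4}

def vectorize(aligned_string):
    """Flatten an aligned behavior information string into a behavior vector.

    Each element of aligned_string is either None (a null action) or a tuple
    (action, interaction_data, seconds_until_next_action). Every action
    contributes three values: identifier, interaction data, time interval.
    """
    vector = []
    for item in aligned_string:
        if item is None:
            # Null action: interaction data and time interval are both 0.
            vector.extend([ACTION_IDS[None], 0, 0])
        else:
            action, interaction_data, interval = item
            vector.extend([ACTION_IDS[action], interaction_data, interval])
    return vector

# Power on, then two minutes later set 23 degrees, then a null action.
behavior_vector = vectorize([("power_on", 0, 120), ("set_temperature", 23, 0), None])
```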
As an alternative embodiment, determining the habit behavior information of the target object in a plurality of different time periods by clustering the behavior vectors of the target object in the plurality of different time periods includes: clustering the behavior vectors in the same time period to obtain a clustering center corresponding to each time period; selecting a plurality of candidate vectors with the distance from the clustering center smaller than a preset distance; obtaining the mean value of the selected candidate vectors to obtain a mean value vector; and converting the behavior data represented by each item in the mean vector into behavior information to obtain the habitual behavior information of the target object in a plurality of different time periods.
Specifically, a mean shift clustering method may be used to obtain the clustering center. A circle with a preset radius is drawn around the clustering center, and the points that fall inside the circle are collected as the candidate vectors. The mean of each item across the candidate vectors is calculated to form the mean vector, and the mean vector is converted back into behavior information using the same correspondence that was used for vectorization, so that the habit behavior information of the user in a plurality of time periods is obtained.
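A small, dependency-free sketch of this step is given below. It assumes a flat (uniform) kernel and Euclidean distance, neither of which the source specifies; `mean_shift`, `candidates_near`, the bandwidth and the sample points are all hypothetical.

```python
import math

def mean_shift(points, bandwidth, max_iter=100, tol=1e-6):
    """Flat-kernel mean shift. Returns (center, member_count) pairs, largest first."""
    def shift(p):
        # Repeatedly move p to the mean of the points within `bandwidth` of it.
        for _ in range(max_iter):
            neighbors = [q for q in points if math.dist(p, q) <= bandwidth]
            if not neighbors:
                return p
            new_p = [sum(column) / len(neighbors) for column in zip(*neighbors)]
            if math.dist(p, new_p) < tol:
                return new_p
            p = new_p
        return p

    modes = []  # each entry is [center, number of points that converged to it]
    for point in points:
        mode = shift(list(point))
        for entry in modes:
            if math.dist(entry[0], mode) < bandwidth / 2:
                entry[1] += 1
                break
        else:
            modes.append([mode, 1])
    modes.sort(key=lambda entry: entry[1], reverse=True)
    return [(tuple(center), count) for center, count in modes]

def candidates_near(center, points, radius):
    """Select the candidate vectors whose distance to the center is below radius."""
    return [p for p in points if math.dist(center, p) < radius]

# Hypothetical two-dimensional behavior vectors for one time period.
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.2), (10.0, 10.0), (10.2, 9.8)]
clusters = mean_shift(points, bandwidth=2.0)
center, count = clusters[0]
candidates = candidates_near(center, points, radius=0.5)
```

The largest cluster here is the one around (1, 1), and all three of its members fall within the chosen radius, so they become the candidate vectors.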
In an optional implementation, mean shift clustering is performed on the behavior vectors in the same time period, and the largest cluster point is selected. Points within a certain distance of that cluster point are selected as initial vectors. The items at each position of the initial vectors are summed and divided by the number of initial vectors whose element at that position is nonzero, which gives the mean vector. The mean vector is then converted into a behavior information string, which is the interaction path frequently used by the user.
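The position-wise mean over nonzero elements, and the conversion back to an interaction path, might look like the sketch below. It assumes vectors built from (identifier, interaction data, time interval) triples, a hypothetical numbering in which 0 is the null action, and decoding by rounding the averaged identifier; none of these details are given in the source.

```python
# Hypothetical reverse lookup from action number to action name.
ACTION_NAMES = {1: "power_on", 2: "set_temperature"}

def nonzero_mean(vectors):
    """Sum each position and divide by the count of nonzero elements there."""
    mean = []
    for column in zip(*vectors):
        nonzero = [value for value in column if value != 0]
        mean.append(sum(nonzero) / len(nonzero) if nonzero else 0.0)
    return mean

def decode(mean_vector):
    """Turn a mean vector back into (action, interaction data, interval) triples."""
    path = []
    for i in range(0, len(mean_vector), 3):
        action_id = round(mean_vector[i])
        if action_id == 0:
            continue  # null action at this position: nothing to emit
        path.append((ACTION_NAMES[action_id], mean_vector[i + 1], mean_vector[i + 2]))
    return path

initial_vectors = [
    [1, 0, 120, 2, 23, 0],
    [1, 0, 60, 2, 25, 0],
    [1, 0, 90, 0, 0, 0],   # second position is a null action
]
mean_vector = nonzero_mean(initial_vectors)
habit_path = decode(mean_vector)
```

Because the third vector has a null action in the second position, the temperature there is averaged over two vectors rather than three, giving 24 instead of 16.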
As an optional embodiment, after obtaining the behavior information of the target object within the preset time range, the method further includes: and eliminating the behavior information with the occurrence frequency lower than the preset frequency.
Specifically, behavior information whose occurrence frequency is lower than the preset frequency is usually behavior that the user rarely performs. It has little reference value and increases the complexity of the computation, so it is eliminated.
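A minimal sketch of this filtering step, assuming the preset frequency is expressed as a minimum occurrence count; the name `drop_rare` and the sample actions are hypothetical.

```python
from collections import Counter

def drop_rare(actions, min_count):
    """Remove actions that occur fewer than min_count times, keeping the order."""
    counts = Counter(actions)
    return [action for action in actions if counts[action] >= min_count]

history = ["power_on", "set_temperature", "power_on", "sleep_mode",
           "set_temperature", "power_on"]
filtered = drop_rare(history, min_count=2)
```

In this sample, `sleep_mode` occurs only once and is removed, while the order of the remaining actions is preserved.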
FIG. 4 is a schematic diagram of determining a behavior information string according to an embodiment of the present invention. With reference to FIG. 4, the interactive actions (i.e., the behavior information) are obtained; interactive actions whose frequency is lower than the preset frequency are removed; the remaining interactive actions are connected by time period; and each connection result is divided according to the time interval to obtain a plurality of interaction paths (i.e., the behavior information strings).
As an alternative embodiment, the behavior information includes voice interaction behavior information between the target object and the home appliance.
In the above scheme, the behavior information includes voice interaction behavior information between the target object and the home appliance, so that the finally obtained habit behavior information is voice habit behavior information of the user, and thereby the behavior habit of the user when performing voice interaction with the home appliance is obtained.
Fig. 5 is a schematic diagram of obtaining user habit information according to an embodiment of the present invention, and with reference to fig. 5, an interaction path (i.e., the behavior information string) is first segmented according to a time interval, the interaction path is aligned and completed, the interaction path is vectorized, mean shift clustering is performed on vectors obtained through the vectorization processing to obtain cluster points, points within a certain distance from the cluster points are selected, and vectors corresponding to the selected points are processed (the mean value of the vectors corresponding to the selected points is obtained and converted into corresponding interaction paths), so as to obtain interaction paths commonly used by users.
Example 2
According to an embodiment of the present invention, there is provided an embodiment of a data processing apparatus, and fig. 6 is a schematic diagram of a data processing apparatus according to an embodiment of the present invention, as shown in fig. 6, the apparatus includes:
The obtaining module 60 is configured to obtain behavior information of the target object, and form a plurality of behavior information strings based on the execution time of the behavior information.
The processing module 62 is configured to perform vectorization processing on the behavior information string to obtain behavior vectors of the target object in a plurality of different time periods.
The determining module 64 is configured to determine the habit behavior information of the target object in a plurality of different time periods by clustering the behavior vectors of the target object in the plurality of different time periods.
As an alternative embodiment, the obtaining module includes: a first obtaining submodule, configured to obtain behavior information of the target object within a preset time range, wherein the preset time range comprises a plurality of time periods; a connection submodule, configured to connect the behavior information belonging to the same time period according to the execution time of the behavior information to obtain a plurality of connection results; and a division submodule, configured to divide the connection results to obtain the behavior information strings in each time period.
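A minimal sketch of the connection step follows. It assumes that a time period is a fixed-width block of hours within a day (`period_hours`) and that each piece of behavior information is a `(timestamp, action)` pair; both are illustrative assumptions, as are the action names.

```python
from collections import defaultdict
from datetime import datetime

def connect_by_period(events, period_hours=6):
    """Group (timestamp, action) events by time period, ordered by time."""
    groups = defaultdict(list)
    for ts, action in events:
        # Hypothetical period index: which block of `period_hours` the hour falls in.
        groups[ts.hour // period_hours].append((ts, action))
    # Sorting by timestamp gives one time-ordered connection result per period.
    return {period: sorted(items) for period, items in groups.items()}

events = [
    (datetime(2020, 1, 16, 7, 30), "turn_on_ac"),
    (datetime(2020, 1, 16, 7, 5), "open_curtain"),
    (datetime(2020, 1, 16, 20, 0), "turn_on_tv"),
]
connected = connect_by_period(events)
```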
As an alternative embodiment, the division submodule includes: an acquisition unit, configured to acquire the time difference between two adjacent pieces of behavior information; and a dividing unit, configured to divide between the two adjacent pieces of behavior information if the time difference is greater than a preset time length.
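The division rule can be sketched as follows, assuming timestamps expressed in seconds and a hypothetical threshold parameter `max_gap_seconds` standing in for the preset time length.

```python
def split_on_gap(events, max_gap_seconds):
    """Split a time-ordered list of (timestamp_seconds, action) pairs
    wherever the gap between neighbours exceeds the threshold."""
    strings, current = [], []
    for event in events:
        # Start a new string when the gap to the previous event is too large.
        if current and event[0] - current[-1][0] > max_gap_seconds:
            strings.append(current)
            current = []
        current.append(event)
    if current:
        strings.append(current)
    return strings

connected = [(0, "wake"), (20, "set_temp"), (400, "turn_on_tv"), (430, "volume_up")]
strings = split_on_gap(connected, max_gap_seconds=60)
```

Here the 380-second gap between `set_temp` and `turn_on_tv` exceeds the threshold, so the connection result is divided into two behavior information strings.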
As an alternative embodiment, the apparatus further comprises: a determining module, configured to determine, before vectorization processing is performed on the behavior information strings to obtain the behavior vectors of the target object in a plurality of different time periods, the time period to which the execution time of the initial behavior information in a behavior information string belongs as the time period to which that behavior information string belongs; and a processing module, configured to perform alignment completion processing on the behavior information strings in each time period so that the behavior information strings in the same time period have the same dimensionality.
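One possible reading of alignment completion is padding shorter strings to the length of the longest one in the same time period. The source does not say how completion is performed, so the padding strategy and the `PAD` placeholder below are assumptions.

```python
PAD = "<pad>"  # hypothetical placeholder for a missing behavior

def align_strings(strings):
    """Pad every behavior information string to the longest length."""
    width = max((len(s) for s in strings), default=0)
    # Append placeholders so that all strings share the same dimensionality.
    return [list(s) + [PAD] * (width - len(s)) for s in strings]

aligned = align_strings([
    ["wake", "set_temp"],
    ["wake", "set_temp", "fan_low"],
])
```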
As an alternative embodiment, the processing module comprises: a second obtaining submodule, configured to obtain the behavior data corresponding to each piece of behavior information in the behavior information string in each time period, wherein the behavior data includes at least one of the following: an identifier of the behavior information, interaction data, and the time interval to the next piece of behavior information; and a composition module, configured to replace the behavior information with the behavior data to form the behavior vectors of the target object in a plurality of different time periods.
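The replacement of behavior information by behavior data can be sketched as below. The `ACTION_IDS` table, the numeric encoding of the interaction data, and the use of 0 as the interval for the last item are assumptions for illustration only.

```python
# Hypothetical identifier table; the source does not define concrete IDs.
ACTION_IDS = {"<pad>": 0, "wake": 1, "set_temp": 2, "fan_low": 3}

def vectorize(string):
    """Turn a time-ordered list of (timestamp_seconds, action, value)
    into a flat vector of [id, value, gap_to_next] triples."""
    vector = []
    for i, (ts, action, value) in enumerate(string):
        # Time interval to the next behavior; 0 is assumed for the last one.
        gap = string[i + 1][0] - ts if i + 1 < len(string) else 0
        vector.extend([ACTION_IDS[action], value, gap])
    return vector

vec = vectorize([(0, "wake", 0), (20, "set_temp", 26), (50, "fan_low", 1)])
# vec == [1, 0, 20, 2, 26, 30, 3, 1, 0]
```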
As an alternative embodiment, the determining module includes: a clustering submodule, configured to cluster the behavior vectors in the same time period to obtain a clustering center corresponding to each time period; a selection submodule, configured to select a plurality of candidate vectors whose distance from the clustering center is smaller than a preset distance; a third obtaining submodule, configured to obtain the mean of the selected candidate vectors to obtain a mean vector; and a conversion submodule, configured to convert the behavior data represented by each item in the mean vector into behavior information to obtain the habitual behavior information of the target object in a plurality of different time periods.
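The selection, averaging, and conversion steps can be sketched as follows, given a clustering center that has already been computed. The sketch assumes that every vector component is a behavior identifier and that a mean component is mapped back by rounding to the nearest identifier; the source does not describe the conversion rule, and `ID_TO_ACTION` is hypothetical.

```python
import math

# Hypothetical reverse lookup from identifier to behavior information.
ID_TO_ACTION = {1: "wake", 2: "set_temp", 3: "fan_low"}

def habit_from_cluster(vectors, center, max_distance):
    """Average the vectors near `center` and map the result back to actions."""
    # Keep only candidates closer to the center than the preset distance.
    candidates = [v for v in vectors if math.dist(v, center) < max_distance]
    if not candidates:
        return []
    # Component-wise mean of the selected candidate vectors.
    mean = [sum(col) / len(candidates) for col in zip(*candidates)]
    # Assumed conversion: round each component to the nearest identifier.
    return [ID_TO_ACTION.get(round(x), "<unknown>") for x in mean]

vectors = [(1, 2, 3), (1, 2, 3), (1, 2, 2), (3, 3, 1)]
habit = habit_from_cluster(vectors, center=(1, 2, 3), max_distance=1.5)
```

The last vector lies about 3 units from the center and is excluded, so the habit is derived from the three remaining vectors.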
As an alternative embodiment, the apparatus further comprises: a rejecting module, configured to reject, after the behavior information of the target object within the preset time range is acquired, the behavior information whose occurrence frequency is lower than a preset frequency.
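A minimal sketch of the rejection step, assuming that occurrence frequency is measured as a raw count over the preset time range and that `min_count` stands in for the preset frequency:

```python
from collections import Counter

def drop_rare(actions, min_count):
    """Remove behaviors that occur fewer than `min_count` times."""
    counts = Counter(actions)
    # Keep the original order; only low-frequency behaviors are dropped.
    return [a for a in actions if counts[a] >= min_count]

kept = drop_rare(["wake", "set_temp", "wake", "sing", "set_temp"], min_count=2)
```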
As an alternative embodiment, the behavior information includes voice interaction behavior information between the target object and the home appliance.
Example 3
According to an embodiment of the present invention, a storage medium is provided, and the storage medium includes a stored program, wherein when the program runs, a device in which the storage medium is located is controlled to execute the data processing method according to embodiment 1.
Example 4
According to an embodiment of the present invention, there is provided a processor, configured to execute a program, where the program executes the data processing method according to embodiment 1.
The serial numbers of the above embodiments of the present invention are for description only and do not indicate the relative merits of the embodiments.
In the above embodiments of the present invention, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
In the embodiments provided in the present application, it should be understood that the disclosed technology can be implemented in other ways. The above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units may be a logical division, and in actual implementation, there may be another division, for example, multiple units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, units or modules, and may be in an electrical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic or optical disk, and various other media capable of storing program code.
The foregoing is only a preferred embodiment of the present invention. It should be noted that those skilled in the art can make various modifications and refinements without departing from the principle of the present invention, and these modifications and refinements should also be regarded as falling within the protection scope of the present invention.
Claims (11)
1. A method for processing data, comprising:
acquiring behavior information of a target object, and forming a plurality of behavior information strings based on the execution time of the behavior information;
vectorizing the behavior information string to obtain behavior vectors of the target object in a plurality of different time periods;
and determining the habitual behavior information of the target object in the different time periods by clustering the behavior vectors of the target object in the different time periods.
2. The method of claim 1, wherein acquiring behavior information of a target object and forming a plurality of behavior information strings based on the execution time of the behavior information comprises:
acquiring behavior information of the target object in a preset time range, wherein the preset time range comprises a plurality of time periods;
connecting the behavior information belonging to the same time period according to the execution time of the behavior information to obtain a plurality of connection results;
and segmenting the connection result to obtain a behavior information string in each time period.
3. The method according to claim 2, wherein segmenting the connection result to obtain the behavior information string in each time period comprises:
acquiring a time difference between two adjacent pieces of behavior information;
and if the time difference is greater than a preset time length, segmenting between the two adjacent pieces of behavior information.
4. The method according to claim 1, wherein before vectorizing the behavior information string to obtain behavior vectors of the target object in a plurality of different time periods, the method further comprises:
determining the time period to which the execution time of the initial behavior information in the behavior information string belongs as the time period to which the behavior information string belongs;
and performing alignment completion processing on the behavior information strings in each time period to ensure that the behavior information strings in the same time period have the same dimensionality.
5. The method according to claim 4, wherein vectorizing the behavior information string to obtain behavior vectors of the target object in a plurality of different time periods comprises:
acquiring behavior data corresponding to each piece of behavior information in the behavior information string in each time period, wherein the behavior data comprises at least one of the following items: an identifier of the behavior information, interaction data, and the time interval to the next piece of behavior information;
and replacing the behavior information with the behavior data to form a behavior vector of the target object in a plurality of different time periods.
6. The method of claim 1, wherein determining the habitual behavior information of the target object in the different time periods by clustering the behavior vectors of the target object in the different time periods comprises:
clustering the behavior vectors in the same time period to obtain a clustering center corresponding to each time period;
selecting a plurality of candidate vectors with the distance from the clustering center smaller than a preset distance;
obtaining the mean value of the selected candidate vectors to obtain a mean value vector;
and converting the behavior data represented by each item in the mean vector into behavior information to obtain the habitual behavior information of the target object in a plurality of different time periods.
7. The method according to claim 2, wherein after acquiring the behavior information of the target object within the preset time range, the method further comprises: eliminating the behavior information whose occurrence frequency is lower than a preset frequency.
8. The method according to claim 1, wherein the behavior information includes voice interaction behavior information between the target object and a home appliance.
9. An apparatus for processing data, comprising:
the acquisition module is used for acquiring the behavior information of the target object and forming a plurality of behavior information strings based on the execution time of the behavior information;
the processing module is used for vectorizing the behavior information string to obtain behavior vectors of the target object in a plurality of different time periods;
and the determining module is used for determining the habit behavior information of the target object in the different time periods by clustering the behavior vectors of the target object in the different time periods.
10. A storage medium, characterized in that the storage medium comprises a stored program, wherein when the program runs, a device where the storage medium is located is controlled to execute the data processing method of any one of claims 1 to 8.
11. A processor, characterized in that the processor is configured to run a program, wherein the program is configured to execute a method for processing data according to any one of claims 1 to 8 when the program is run.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010049536.8A CN111274462A (en) | 2020-01-16 | 2020-01-16 | Data processing method and device |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111274462A (en) | 2020-06-12 |
Family
ID=71000944
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010049536.8A Pending CN111274462A (en) | 2020-01-16 | 2020-01-16 | Data processing method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111274462A (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107426177A (en) * | 2017-06-13 | 2017-12-01 | 努比亚技术有限公司 | A kind of user behavior clustering method and terminal, computer-readable recording medium |
CN108470034A (en) * | 2018-02-01 | 2018-08-31 | 百度在线网络技术(北京)有限公司 | A kind of smart machine service providing method and system |
CN109933502A (en) * | 2019-01-23 | 2019-06-25 | 平安科技(深圳)有限公司 | Electronic device, the processing method of user operation records and storage medium |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112057079A (en) * | 2020-08-07 | 2020-12-11 | 中国科学院深圳先进技术研究院 | Behavior quantification method and terminal based on state and map |
WO2022027590A1 (en) * | 2020-08-07 | 2022-02-10 | 中国科学院深圳先进技术研究院 | State and graph-based behavior quantification method and terminal |
CN112883257A (en) * | 2021-01-11 | 2021-06-01 | 北京达佳互联信息技术有限公司 | Behavior sequence data processing method and device, electronic equipment and storage medium |
WO2022148186A1 (en) * | 2021-01-11 | 2022-07-14 | 北京达佳互联信息技术有限公司 | Behavioral sequence data processing method and apparatus |
CN112883257B (en) * | 2021-01-11 | 2024-01-05 | 北京达佳互联信息技术有限公司 | Behavior sequence data processing method and device, electronic equipment and storage medium |
CN115204322A (en) * | 2022-09-16 | 2022-10-18 | 成都新希望金融信息有限公司 | Behavioral link abnormity identification method and device |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111274462A (en) | | Data processing method and device |
CN109118330B (en) | | Household appliance recommendation method and device, storage medium and server |
EP3037983A1 (en) | | Data processing system, data processing method, and data processing device |
CN110738577B (en) | | Community discovery method, device, computer equipment and storage medium |
US11907659B2 (en) | | Item recall method and system, electronic device and readable storage medium |
CN103106285A (en) | | Recommendation algorithm based on information security professional social network platform |
CN107894827B (en) | | Application cleaning method and device, storage medium and electronic equipment |
CN106201624A (en) | | A kind of recommendation method of application program and terminal |
CN106294219A (en) | | A kind of equipment identification, data processing method, Apparatus and system |
Kang et al. | | A service scenario generation scheme based on association rule mining for elderly surveillance system in a smart home environment |
CN111597241A (en) | | Method, device and equipment for data acquisition |
CN114880560A (en) | | Content recommendation method and device, storage medium and electronic device |
CN114223139B (en) | | Interface switching method and device, wearable electronic equipment and storage medium |
CN114855416A (en) | | Recommendation method and device of washing program, storage medium and electronic device |
CN114154078A (en) | | Information recommendation method and device, electronic equipment and storage medium |
KR20180007248A (en) | | Method for frequent itemset mining from uncertain data with different item importance and uncertain weighted frequent item mining apparatus performing the same |
CN112905937A (en) | | Service content updating and generating method based on big data and cloud computing service system |
CN114595372A (en) | | Scene recommendation method and device, computer equipment and storage medium |
CN106469086B (en) | | Event processing method and device |
CN114861678A (en) | | Method and apparatus for determining time information, storage medium, and electronic apparatus |
CN115599260A (en) | | Intelligent scene generation method, device and system, storage medium and electronic device |
CN111107493A (en) | | Method and system for predicting position of mobile user |
CN113326296B (en) | | Load decomposition method and system suitable for industrial and commercial users |
CN114417988A (en) | | Method and apparatus for determining operation information, storage medium, and electronic apparatus |
CN111669654B (en) | | Program recommendation method and device, electronic equipment and readable storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | PB01 | Publication | |
 | SE01 | Entry into force of request for substantive examination | |
 | RJ01 | Rejection of invention patent application after publication | Application publication date: 20200612 |