CN109002849B

CN109002849B - Method and device for identifying development stage of object

Info

Publication number: CN109002849B
Application number: CN201810732277.1A
Authority: CN
Inventors: 陈冉
Original assignee: Baidu Online Network Technology Beijing Co Ltd
Current assignee: Baidu Online Network Technology Beijing Co Ltd
Priority date: 2018-07-05
Filing date: 2018-07-05
Publication date: 2022-05-17
Anticipated expiration: 2038-07-05
Also published as: CN109002849A

Abstract

The application provides a method and a device for identifying development stages of an object, wherein the method comprises the following steps: acquiring a trained recognition model; the identification model is generated by training a search sequence of a target object after generating a corresponding search sequence for the target object in a preset development stage; the search sequence is used for indicating search behavior data of the corresponding object at a plurality of time points; acquiring a search sequence corresponding to an object to be identified; identifying a search sequence corresponding to an object to be identified by adopting an identification model; and determining whether the object to be identified is in a preset development stage or not according to the information obtained by identification. Whether the object to be recognized is in the preset development stage or not is recognized based on the recognition model, so that the artificial participation degree can be reduced, the recognition efficiency is improved, and the timeliness of object recognition is improved.

Description

Method and device for identifying development stage of object

Technical Field

The present application relates to the field of internet technologies, and in particular, to a method and an apparatus for identifying an object development stage.

Background

Emerging industries include two types of business: breakthrough growth business (i.e., new products or new models in an existing market) and strategic new business development (i.e., creation of new markets, new segments, or even new industries). For emerging industries, the earlier they are discovered, the higher the investment risk and the greater the return; the later they are discovered, the lower the investment risk and the smaller the return. Therefore, predicting emerging industries and positioning in them as early as possible is the most efficient form of investment.

In the prior art, there are two main ways of predicting emerging industries: first, collecting usage indicators of users for applications (APPs) related to various industries and then identifying emerging industries based on those indicators; second, identifying emerging industries based on human prior knowledge.

In the first way, identification lags considerably; in the second way, identification is performed manually and is therefore inefficient.

Disclosure of Invention

The application provides a method and a device for identifying the development stage of an object, which are used to solve the technical problems of high identification lag and low efficiency in the prior art.

An embodiment of a first aspect of the present application provides a method for identifying a development stage of an object, including:

acquiring a trained recognition model; the identification model is generated by training a search sequence of a target object after the corresponding search sequence is generated for the target object in a preset development stage; the search sequence is used for indicating search behavior data of corresponding objects at a plurality of time points;

acquiring a search sequence corresponding to an object to be identified;

identifying a search sequence corresponding to the object to be identified by adopting the identification model;

and determining whether the object to be identified is in the preset development stage or not according to the information obtained by identification.

In the method for identifying the development stage of an object according to the embodiments of the present application, a trained recognition model is acquired, where the recognition model is obtained by generating a corresponding search sequence for a target object in a preset development stage and then training with the search sequence of the target object, and the search sequence indicates search behavior data of the corresponding object at a plurality of time points. A search sequence corresponding to an object to be identified is then acquired, the recognition model is used to identify that search sequence, and whether the object to be identified is in the preset development stage is determined according to the information obtained by identification. Because the identification is performed by the recognition model, manual involvement can be reduced and identification efficiency improved. In addition, when users encounter unfamiliar content or have specific needs, they tend to resolve them by searching. Search terms with long-tail, low-frequency characteristics represent users' niche demands, and such demands often appear in the form of search terms when a new thing has just emerged and before the corresponding APP comes onto the market. Therefore, training the recognition model with the search sequence of the target object and using the trained model to identify the development stage of the object to be identified can yield results earlier than identification based on users' usage indicators in the APP, which improves the timeliness of object identification.

The embodiment of the second aspect of the present application provides an apparatus for identifying a development stage of an object, including:

the first acquisition module is used for acquiring the trained recognition model; the identification model is generated by training a search sequence of a target object after the corresponding search sequence is generated for the target object in a preset development stage; the search sequence is used for indicating search behavior data of corresponding objects at a plurality of time points;

the second acquisition module is used for acquiring a search sequence corresponding to the object to be identified;

the first identification module is used for identifying the search sequence corresponding to the object to be identified by adopting the identification model;

and the determining module is used for determining whether the object to be identified is in the preset development stage or not according to the information obtained by identification.

The device for identifying the development stage of an object according to the embodiments of the present application acquires a trained recognition model, where the recognition model is obtained by generating a corresponding search sequence for a target object in a preset development stage and then training with the search sequence of the target object, and the search sequence indicates search behavior data of the corresponding object at a plurality of time points. The device then acquires a search sequence corresponding to an object to be identified, uses the recognition model to identify that search sequence, and determines whether the object to be identified is in the preset development stage according to the information obtained by identification. Because the identification is performed by the recognition model, manual involvement can be reduced and identification efficiency improved. In addition, when users encounter unfamiliar content or have specific needs, they tend to resolve them by searching. Search terms with long-tail, low-frequency characteristics represent users' niche demands, and such demands often appear in the form of search terms when a new thing has just emerged and before the corresponding APP comes onto the market. Therefore, training the recognition model with the search sequence of the target object and using the trained model to identify the development stage of the object to be identified can yield results earlier than identification based on users' usage indicators in the APP, which improves the timeliness of object identification.

An embodiment of a third aspect of the present application provides a computer device, including: a memory, a processor, and a computer program stored on the memory and executable on the processor, where the processor, when executing the program, implements the method for identifying a development stage of an object as set forth in the embodiments of the first aspect of the present application.

In order to achieve the above object, a fourth aspect of the present application provides a non-transitory computer-readable storage medium, on which a computer program is stored, where the computer program is configured to, when executed by a processor, implement a method for identifying a development stage of an object as set forth in an embodiment of the first aspect of the present application.

In order to achieve the above object, a fifth aspect of the present application provides a computer program product, wherein instructions of the computer program product, when executed by a processor, perform the method for identifying development stages of an object as provided in the first aspect of the present application.

Additional aspects and advantages of the present application will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the present application.

Drawings

The foregoing and/or additional aspects and advantages of the present application will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:

FIG. 1 is a flowchart illustrating a method for identifying a development stage of an object according to an embodiment of the present disclosure;

FIG. 2 is a schematic diagram of the trend of "short lease" within one year in the example of the present application;

FIG. 3 is a schematic diagram of a PVR curve of a "Mobai bicycle" after smoothing in an example of the present application;

FIG. 4 is a flowchart illustrating a method for identifying a development stage of an object according to a second embodiment of the present application;

FIG. 5a is a schematic diagram illustrating the PVR trend of "fire victims may be burning a cartoon evil" in the embodiment of the present application;

FIG. 5b is a schematic diagram showing the PVR trend of "garnet" in the examples of the present application;

FIG. 5c is a schematic diagram of PVR trend of the "seal character" in the embodiment of the present application;

FIG. 5d is a schematic diagram of the PVR trend of "11-out-of-5 clever play" in the embodiment of the present application;

FIG. 6 is a schematic diagram illustrating an accuracy rate variation curve of an identification model in a process of iteratively updating a test set according to an embodiment of the present application;

FIG. 7 is a flowchart illustrating a method for identifying a development stage of an object according to a third embodiment of the present application;

FIG. 8 is a schematic diagram illustrating the mutual information size between "shared cars" and various groups of people in the embodiment of the present application;

FIG. 9 is a schematic structural diagram of an apparatus for identifying a development stage of an object according to a fourth embodiment of the present application;

FIG. 10 is a schematic structural diagram of an apparatus for identifying a development stage of an object according to a fifth embodiment of the present application;

FIG. 11 illustrates a block diagram of an exemplary computer device suitable for use in implementing embodiments of the present application.

Detailed Description

Reference will now be made in detail to embodiments of the present application, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the drawings are exemplary and intended to be used for explaining the present application and should not be construed as limiting the present application.

Currently, predicting and identifying emerging industries mainly comprises the following two ways:

In the first mode, usage indicators of users are collected for APPs related to various industries. The usage indicators may be, for example, data on how users download and use the APPs, including the number of Daily Active Users (DAU), the number of Weekly Active Users (WAU), the number of Monthly Active Users (MAU), DAU/MAU, total usage duration, average daily usage duration, number of uses, average daily number of uses, and the like. Emerging industries are then determined by analyzing the changes in the usage indicators and the period-over-period change rates of their rankings, in combination with a predictive analysis algorithm.

In the second mode, a comprehensive manual judgment is made based on prior knowledge. Specifically, the businesses covered by an industry are screened, and emerging industries are determined in combination with prior knowledge such as related industry analysis reports, third-party analysis data, and industry experience.

The first mode, which identifies based on users' usage indicators, has low accuracy and high lag. For example, for an APP in an emerging industry, by the time its DAU reaches a certain magnitude and its number of users grows explosively, the emerging industry has already reached an initial scale and formed a degree of competitiveness, so its investment value is low. The second mode, which identifies based on human prior knowledge, has low identification efficiency.

The embodiments of the present application mainly address the technical problems of low identification accuracy, low efficiency, and high lag in the prior art, and provide a method for identifying the development stage of an object.

The method for identifying the development stage of the object comprises the steps of obtaining a trained identification model, wherein the identification model is generated by generating a corresponding search sequence for a target object in a preset development stage and then training the target object by adopting the search sequence of the target object, the search sequence is used for indicating search behavior data of the corresponding object at a plurality of time points, then obtaining the search sequence corresponding to the object to be identified, then identifying the search sequence corresponding to the object to be identified by adopting the identification model, and finally determining whether the object to be identified is in the preset development stage according to information obtained by identification.

In the present application, whether the object to be identified is in the preset development stage is determined by the recognition model, so manual involvement can be reduced and identification efficiency improved. In addition, when users encounter unfamiliar content or have specific needs, they tend to resolve them by searching. Search terms with long-tail, low-frequency characteristics represent users' niche demands, and such demands often appear in the form of search terms when a new thing has just emerged and before the corresponding APP comes onto the market. Therefore, training the recognition model with the search sequence of the target object and using the trained model to identify the development stage of the object to be identified can yield results earlier than identification based on users' usage indicators in the APP, which improves the timeliness of object identification.

A method and apparatus for identifying a development stage of an object according to an embodiment of the present application will be described below with reference to the accompanying drawings. Before describing embodiments of the present invention in detail, for ease of understanding, common terminology will be introduced first:

The basic principle of the long tail theory is as follows: as long as the channels for storing and distributing products are large enough, the combined market share of products with low demand or poor sales can equal, or even exceed, the market share of the few best-selling products. That is, many small markets converge to produce market energy comparable to that of the mainstream. For a search engine, for example, a few core or generic keywords may bring a website more than half of its visits. Very specific keywords, that is, keywords with long-tail, low-frequency characteristics, are each searched by only a small number of people, but taken together they can also bring a considerable number of visits, and the customer conversion rate of visits arriving through such keywords is often much higher than that of generic keywords.

Mutual information is a measure reflecting the interdependence between two random variables in probability theory and information theory.
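As an illustration of this definition, the following minimal sketch estimates the mutual information between two discrete variables from observed pairs, using the standard formula I(X;Y) = sum over (x, y) of p(x,y) * log2(p(x,y) / (p(x) * p(y))). The function name and the sample data are hypothetical and are not taken from the application.

```python
import math
from collections import Counter


def mutual_information(pairs):
    """Estimate I(X;Y) in bits from a list of observed (x, y) pairs."""
    n = len(pairs)
    joint = Counter(pairs)                 # counts of each (x, y) combination
    px = Counter(x for x, _ in pairs)      # marginal counts of x
    py = Counter(y for _, y in pairs)      # marginal counts of y
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        # p_xy / (p_x * p_y), written with raw counts
        ratio = count * n / (px[x] * py[y])
        mi += p_xy * math.log2(ratio)
    return mi


# Hypothetical observations: (user group, whether the user searched a term).
independent = [("a", 0), ("a", 1), ("b", 0), ("b", 1)]
dependent = [("a", 0), ("a", 0), ("b", 1), ("b", 1)]

print(mutual_information(independent))  # no dependence between the variables
print(mutual_information(dependent))    # the group fully determines the outcome
```

A value of zero means the two variables are independent; larger values indicate stronger dependence.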

Fig. 1 is a flowchart illustrating a method for identifying a development stage of an object according to an embodiment of the present disclosure.

As shown in fig. 1, the method for identifying the development stage of an object includes the following steps:

Step 101, acquiring a trained recognition model; the recognition model is generated by training with a search sequence of a target object after the corresponding search sequence is generated for the target object in a preset development stage; the search sequence is used to indicate search behavior data of the corresponding object at a plurality of time points.

It will be appreciated that an object may be in one of three development stages: an early development stage, a middle development stage, and a late development stage, where the object may be an enterprise, an industry, or the like. In this embodiment of the application, the preset development stage may be the early development stage, and the target object is an object in the early development stage. For example, when the object is an industry, the target object may be an emerging industry, such as short video, shared bicycles, shared cars, and the like.

It should be noted that, in a search scenario, users generate search behavior data, from which users' demands for specific information can be determined; these may include emerging demands. According to the long tail theory, search terms with long-tail, low-frequency characteristics represent users' niche demands, which may include emerging demands, and such niche demands generally appear in the form of search terms when a new thing has just emerged and before the corresponding APP comes onto the market. Therefore, by mining low-frequency search terms whose search volume (PV) is relatively high and determining the development stage of an object based on them, the development stage can be identified earlier than with identification based on users' usage indicators in an APP, which improves the timeliness of object identification.

In the embodiment of the application, firstly, a search word for searching a target object can be obtained, then, a search sequence corresponding to the target object is generated according to search behavior data corresponding to the search word, and then, the search sequence of the target object is adopted to train a recognition model, so that the development stage of the object can be recognized by using the recognition model, wherein the search sequence is used for indicating the search behavior data of the corresponding object at a plurality of time points.

The search behavior data may include a search volume PV or a search ratio (PVR).

Specifically, for target objects in the preset development stage, the search terms used for searching for them have a similar shape in terms of search volume PV: the search volume PV shows a sudden growth trend within a certain period, while remaining in a low-frequency fluctuating state over a long period. For example, when the search term used for searching for the target object is "short lease", which is a search term with long-tail, low-frequency characteristics, its PV trend within one year may be as shown in FIG. 2. As can be seen from FIG. 2, the search volume PV of "short lease" shows a sudden growth trend starting from March 2017.

Since the suddenly growing states are only related to the state at the latest time and conform to the Markov property, as a possible implementation manner, the recognition Model may be a Hidden Markov Model (HMM for short) with a preset number of Hidden states; wherein, the number of the hidden states ranges from 4 to 6.

It should be noted that the search volume PV of a sudden hotspot event also shows a sudden growth trend. On the PV trend graph, it differs from a search term used for searching for the target object in duration: the growth of a search term for the target object lasts longer, whereas that of a sudden hotspot event lasts only briefly. Therefore, in the present application, to avoid the influence of sudden hotspot events, the search behavior data may be smoothed; specifically, the search sequence corresponding to the target object may be generated from the search behavior data corresponding to the search term used for searching for the target object. The recognition model can then be trained with the search sequence of the target object, which improves its recognition accuracy.

Specifically, a search word for searching for a target object may be obtained, and then search behavior data corresponding to each second duration may be generated according to search behavior data of the search word in each first duration; and the second time length comprises a plurality of first time lengths, and the search behavior data in each second time length is used as the search behavior data of the corresponding time point in the search sequence.

For example, the average value of the search ratio PVR within each first duration may be used as the search behavior data of the corresponding time point. In this way, a plurality of pieces of search behavior data within the second duration are obtained and used as the search behavior data of the corresponding time points in the search sequence. That is, the PVR average value within the first duration before each time point is used as the search behavior data of that time point in the search sequence.

As an example, referring to FIG. 3, FIG. 3 is a schematic diagram of a PVR curve after smoothing in an embodiment of the present application. The search term used for searching for the target object is "Mobai bicycle". Assuming that the first duration is 14 days and the second duration is 6 months, the average PVR value within the 14 days before each time point can be used as the search behavior data of that time point in the search sequence.
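The smoothing described above can be sketched as a trailing moving average. This is a minimal illustration, not the applicant's implementation: the function name `smooth_pvr` is hypothetical, the input is assumed to be one PVR value per day, and the window is assumed to end at (and include) the time point in question, which the text does not specify.

```python
def smooth_pvr(daily_pvr, window=14):
    """Trailing mean over `window` days.

    Assumption: each output value averages the `window` daily values ending
    at the corresponding day; days without a full window are skipped.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    smoothed = []
    running = 0.0
    for i, value in enumerate(daily_pvr):
        running += value
        if i >= window:
            running -= daily_pvr[i - window]  # drop the day that left the window
        if i >= window - 1:
            smoothed.append(running / window)
    return smoothed


# A one-day spike (such as a sudden hotspot event) is flattened by the average.
spiky = [0.0, 0.0, 10.0, 0.0, 0.0]
print(smooth_pvr(spiky, window=5))
```

With a first duration of 14 days and roughly 180 days of data for the second duration, the resulting search sequence has 167 time points under these assumptions.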

In the embodiment of the application, after the search sequence corresponding to the target object is generated, the recognition model can be trained with the search sequence to obtain the trained recognition model. Specifically, a training set may be generated according to the search sequence corresponding to the target object, and the labels of the search sequences in the training set are obtained, where a label indicates the trend of the search behavior data over time and may be one of: simple rise, simple fall, complex rise, complex fall, and fluctuation. The recognition model may then be trained using the training set.

Step 102, acquiring a search sequence corresponding to the object to be identified.

In the embodiment of the application, the object to be identified is an object which needs to be identified in a development stage.

In the embodiment of the application, the search sequence corresponding to the object to be identified is generated according to the search behavior data corresponding to the search word for searching the object to be identified. As one possible implementation manner, a search word for searching for an object to be recognized may be obtained, where the search word has a long-tailed low-frequency characteristic, for example, when the object to be recognized is a shared vehicle, the search word may be a shared bicycle, a shared trolley, a shared car, or the like. Then, generating search behavior data corresponding to each second time length according to the search behavior data of the search word in each first time length; and the second time length comprises a plurality of first time lengths, and the search behavior data in each second time length is used as the search behavior data of the corresponding time point in the search sequence.

For example, the average value of the search ratio PVR in each first time period may be used as the search behavior data of the corresponding time point, so that a plurality of search behavior data in the second time period may be obtained, and the obtained plurality of search behavior data may be used as the search behavior data of the corresponding time point in the search sequence. That is, the PVR average value in the first time period before each time point may be used as the search behavior data of each corresponding time point in the search sequence. Assuming that the first duration is 14 days and the second duration is 6 months, the PVR average value within 14 days before each time point may be used as the search behavior data for the time point in the search sequence.

Step 103, identifying the search sequence corresponding to the object to be identified by adopting the recognition model.

In the embodiment of the application, after the search sequence corresponding to the object to be recognized is obtained, the recognition model can be adopted to recognize the search sequence corresponding to the object to be recognized. Specifically, the search sequence corresponding to the object to be recognized may be input to the recognition model, so as to obtain the recognition result.

For example, when the recognition model is an HMM, the trained HMM can learn the hidden state sequence corresponding to each search sequence in the training set. For the training set, the HMM obtains the correspondence between labels and hidden state sequences from the label and the hidden state sequence corresponding to each search sequence. For example, suppose the HMM has 5 hidden states, denoted A, B, C, D, and E, and the search sequence (PVR sequence) corresponding to a search term is labeled. Assuming that the search term is "shared cars" and the corresponding PVR sequence is (0.005%, 0.008%, 0.014%, 0.025%, 0.1%, 0.1%), the corresponding hidden state sequence may be A-C-B-D-E. According to the correspondence, the HMM can identify the search sequence corresponding to the object to be identified, and the obtained identification result may be the prediction label of the object to be identified corresponding to the search sequence.
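The correspondence between labels and hidden state sequences can be sketched as follows. This is a simplified illustration under stated assumptions: the observations are assumed to be discretized into symbols, the HMM parameters are assumed to be already trained, Viterbi decoding is used to obtain the hidden state sequence, and each decoded sequence is mapped to its most frequent training label. The application does not specify these details, and all names and numbers below are hypothetical.

```python
import math
from collections import Counter, defaultdict


def viterbi(obs, states, start_p, trans_p, emit_p):
    """Return the most likely hidden-state path for discrete observations."""
    def log(p):
        return math.log(p) if p > 0 else float("-inf")

    # score[s]: best log-probability of any path that ends in state s
    score = {s: log(start_p[s]) + log(emit_p[s][obs[0]]) for s in states}
    back = []
    for o in obs[1:]:
        prev, score, pointers = score, {}, {}
        for s in states:
            best = max(states, key=lambda r: prev[r] + log(trans_p[r][s]))
            score[s] = prev[best] + log(trans_p[best][s]) + log(emit_p[s][o])
            pointers[s] = best
        back.append(pointers)
    # Trace the best path backwards from the best final state.
    path = [max(states, key=lambda s: score[s])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


def learn_correspondence(decoded_paths, labels):
    """Map each hidden-state path seen in training to its most frequent label."""
    votes = defaultdict(Counter)
    for path, label in zip(decoded_paths, labels):
        votes[tuple(path)][label] += 1
    return {path: c.most_common(1)[0][0] for path, c in votes.items()}


# Hypothetical two-state model over discretized PVR levels.
states = ("A", "B")
start_p = {"A": 0.9, "B": 0.1}
trans_p = {"A": {"A": 0.7, "B": 0.3}, "B": {"A": 0.1, "B": 0.9}}
emit_p = {"A": {"low": 0.9, "high": 0.1}, "B": {"low": 0.2, "high": 0.8}}

train_obs = [["low", "low", "high", "high"], ["high", "high", "low", "low"]]
train_labels = ["simple rise", "simple fall"]
paths = [viterbi(o, states, start_p, trans_p, emit_p) for o in train_obs]
table = learn_correspondence(paths, train_labels)

new_path = viterbi(["low", "low", "high", "high"], states, start_p, trans_p, emit_p)
print(new_path, table.get(tuple(new_path)))
```

An exact lookup on the decoded path is the simplest possible correspondence; a practical system would likely need a softer match, since unseen paths return no label here.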

Step 104, determining whether the object to be identified is in the preset development stage according to the information obtained by identification.

In the embodiment of the application, an identification model is adopted to identify the search sequence corresponding to the object to be identified, the information obtained by identification can comprise the prediction label of the object to be identified, and when the prediction label of the object to be identified indicates that the search behavior data is increased along with time in the information obtained by identification, the object to be identified can be determined to be in a preset development stage; and when the prediction marking of the object to be recognized in the recognized information indicates that the search behavior data does not rise along with the time, determining that the object to be recognized is not in the preset development stage.

Further, in order to improve the accuracy of recognition, the information obtained by recognition may further include a confidence level of the object to be recognized, and when the prediction label of the object to be recognized indicates that the search behavior data rises with time and the confidence level corresponding to the object to be recognized is greater than a second threshold value, it is determined that the object to be recognized is in a preset development stage; and when the corresponding confidence coefficient is not greater than the second threshold value, determining that the object to be recognized is not in the preset development stage.
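The decision rule in step 104 can be sketched as a small function. This is an illustrative sketch only: the label strings, the choice to treat both "simple rise" and "complex rise" as rising, and the default threshold value of 0.8 are assumptions, since the application does not give concrete values.

```python
# Assumption: both rising labels count as "search behavior data rises over time".
RISING_LABELS = {"simple rise", "complex rise"}


def in_preset_stage(predicted_label, confidence=None, second_threshold=0.8):
    """Decide whether the object is in the preset development stage.

    Without a confidence, only the predicted label is checked. With a
    confidence, it must also be strictly greater than the second threshold
    (0.8 is a hypothetical default).
    """
    rising = predicted_label in RISING_LABELS
    if confidence is None:
        return rising
    return rising and confidence > second_threshold


print(in_preset_stage("simple rise"))                   # label only
print(in_preset_stage("complex rise", confidence=0.6))  # rising, low confidence
print(in_preset_stage("fluctuation", confidence=0.95))  # confident, not rising
```

Requiring both conditions trades recall for precision: an object is only reported when the trend is rising and the model is sufficiently confident.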

In the method for identifying the development stage of an object according to the embodiments of the present application, a trained recognition model is acquired, where the recognition model is obtained by generating a corresponding search sequence for a target object in a preset development stage and then training with the search sequence of the target object, and the search sequence indicates search behavior data of the corresponding object at a plurality of time points. A search sequence corresponding to an object to be identified is then acquired, the recognition model is used to identify that search sequence, and whether the object to be identified is in the preset development stage is determined according to the information obtained by identification. Because the identification is performed by the recognition model, manual involvement can be reduced and identification efficiency improved. In addition, when users encounter unfamiliar content or have specific needs, they tend to resolve them by searching. Search terms with long-tail, low-frequency characteristics represent users' niche demands, and such demands often appear in the form of search terms when a new thing has just emerged and before the corresponding APP comes onto the market. Therefore, training the recognition model with the search sequence of the target object and using the trained model to identify the development stage of the object to be identified can yield results earlier than identification based on users' usage indicators in the APP, which improves the timeliness of object identification.

In the embodiment of the present application, before obtaining the trained recognition model, the recognition model may be trained, and the above process is described in detail with reference to fig. 4.

Fig. 4 is a flowchart illustrating a method for identifying a development stage of an object according to a second embodiment of the present application.

As shown in fig. 4, the method for identifying the development stage of an object may include the steps of:

Step 201, generating a training set according to the search sequence corresponding to the target object, and acquiring the labels of the search sequences in the training set; a label is used to indicate the trend of the search behavior data over time.

In the embodiment of the present application, the labels may include: simple rise, simple fall, complex rise, complex fall, and wave.

In the embodiment of the present application, the search sequence corresponding to the target object is generated according to the search term used for searching the target object, and the specific execution process may refer to the execution process in step 101 in the above embodiment, which is not described herein again.

As a possible implementation manner, the search sequences in the training set can be labeled manually according to the time-varying trend of the search behavior data.

Step 202, generating a test set according to the search sequences corresponding to the test objects in each development stage, and obtaining the labels of the search sequences in the test set.

In the embodiment of the application, the test object is an object in the early development stage, the middle development stage and/or the later development stage.

Specifically, a search word for searching the test object may be acquired, and then, according to the search word, the search behavior data of the corresponding time point in the search sequence may be determined. The specific execution process is similar to the generation process of the search sequence corresponding to the target object, and is not described herein again.

As a possible implementation manner, the search sequences in the test set can be labeled manually according to the time-varying trend of the search behavior data.

Step 203, training the recognition model by adopting the training set.

Step 204, testing the trained recognition model by adopting the test set to obtain the prediction labels of the test set.

For example, when the recognition model is an HMM, after the HMM is trained, the HMM can learn to obtain a hidden state sequence corresponding to a search sequence in a training set, and for the training set, the HMM model obtains a correspondence between a label and the hidden state sequence according to the label corresponding to the search sequence and the hidden state sequence corresponding to the search sequence. According to the corresponding relation, the HMM can identify the search sequences in the test set, and the prediction labels can be obtained.

Step 205, obtaining the performance parameters of the recognition model according to the difference between the prediction label and the label obtained when the test set is generated.

In the embodiment of the present application, the performance parameters may include parameters such as accuracy and/or recall.

It is understood that when the difference is larger, the recognition model is indicated to be recognized with lower accuracy, and the performance parameter of the recognition model is lower, whereas when the difference is smaller, the recognition model is indicated to be recognized with higher accuracy, and the performance parameter of the recognition model is higher.
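As an illustration of step 205, the performance parameters might be computed as in the minimal sketch below. The function name `performance_parameters`, the label strings, and the choice to report recall for a single label are assumptions made for illustration; the application only states that the parameters may include accuracy and/or recall.

```python
from typing import Sequence, Tuple


def performance_parameters(
    predicted: Sequence[str],
    annotated: Sequence[str],
    positive_label: str = "simple_rise",  # hypothetical label string
) -> Tuple[float, float]:
    """Compare prediction labels with the manual labels of the test set.

    Returns (accuracy, recall), where recall is computed for one label.
    """
    if len(predicted) != len(annotated) or not annotated:
        raise ValueError("label lists must be non-empty and of equal length")

    # Accuracy: fraction of test sequences whose predicted label matches.
    matches = sum(1 for p, a in zip(predicted, annotated) if p == a)
    accuracy = matches / len(annotated)

    # Recall: fraction of truly positive sequences that were predicted positive.
    true_pos = sum(
        1 for p, a in zip(predicted, annotated)
        if a == positive_label and p == positive_label
    )
    actual_pos = sum(1 for a in annotated if a == positive_label)
    recall = true_pos / actual_pos if actual_pos else 0.0
    return accuracy, recall
```

The smaller the difference between the two label lists, the higher both values become, which matches the relationship described above.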

Step 206, generating a candidate set according to the search sequence corresponding to the candidate object; wherein the candidate object is different from the target object and the test object, and the search behavior data of the candidate object rises with time.

It should be noted that, because the search sequences corresponding to the target object are few and collection is difficult, if the recognition model is trained by using the training set alone, the problem of insufficient training set occurs, and the accuracy of recognition of the recognition model is reduced. Therefore, in the present application, a candidate object having similarity to the target object may be determined, and the search volume PV of the candidate object also exhibits a sudden increase tendency, that is, the search behavior data of the candidate object increases with time. Therefore, the training set can be expanded by utilizing the search sequence corresponding to the candidate object, specifically, the training set can be expanded by adopting a semi-supervised learning method, wherein the semi-supervised learning method can train the recognition model by utilizing a small amount of marked data and a large amount of unmarked data.

It will be appreciated that search terms used to search for candidate objects may also express an increase in demand without expressing an emerging demand, such as a television show or a novel gradually becoming popular. For example, referring to fig. 5a, the search term used to search for the candidate object is "the fire picture is worried by the fire picture", and the search behavior data (PVR) of the candidate object rises with time; referring to fig. 5b, the search term is "garnet", and the search behavior data of the candidate object rises with time; referring to fig. 5c, the search term is "small seal character", and the search behavior data of the candidate object rises with time; referring to fig. 5d, the search term is "11, selecting 5 from the most clever playing methods", and the search behavior data of the candidate object rises with time. It can be seen that although the search behavior data of the candidate objects in fig. 5a-5d rises with time, the candidate objects are not in the preset development stage.

Step 207, identifying the candidate set by adopting the trained recognition model to obtain the prediction label and the confidence of the search sequence corresponding to each candidate object.

For example, when the recognition model is an HMM, after the HMM is trained, the HMM can learn to obtain a hidden state sequence corresponding to a search sequence in a training set, and for the training set, the HMM model obtains a correspondence between a label and the hidden state sequence according to the label corresponding to the search sequence and the hidden state sequence corresponding to the search sequence. According to the corresponding relation, the HMM can identify the search sequences in the candidate set, and the prediction labels of the search sequences corresponding to the candidate objects can be obtained.

Further, when the trained recognition model is used to recognize the candidate set, the confidence of the search sequence corresponding to each candidate object can also be obtained. When the confidence is low, the search sequence corresponding to the candidate object is not suitable for use as a training sample, and it need not be added to the training set; when the confidence is high, step 208 may be triggered.

Step 208, if the prediction label of the search sequence corresponding to the candidate object indicates that the search behavior data rises with time and the confidence is greater than the first threshold, adding the search sequence corresponding to the candidate object to the training set.

Wherein the first threshold is preset.

In the embodiment of the application, when the prediction label of the search sequence corresponding to the candidate object indicates that the search behavior data rises along with time and the confidence degree is greater than a first threshold, the search sequence corresponding to the candidate object may be added to the training set, so as to perform next training on the recognition model by using the newly generated training set.
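The filtering in step 208 can be sketched as follows. This is only an illustration under stated assumptions: the `Prediction` container, the label strings, and the choice to treat both "simple rise" and "complex rise" as rising labels are hypothetical and are not specified by the application.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Assumption: these two labels are the ones that indicate a rise over time.
RISING_LABELS = {"simple_rise", "complex_rise"}


@dataclass
class Prediction:
    sequence: Tuple[float, ...]  # search sequence of a candidate object
    label: str                   # prediction label from the recognition model
    confidence: float            # confidence reported by the recognition model


def expand_training_set(
    training_set: List[Tuple[Tuple[float, ...], str]],
    predictions: List[Prediction],
    first_threshold: float,
) -> List[Tuple[Tuple[float, ...], str]]:
    """Return a new training set extended with confident rising candidates."""
    expanded = list(training_set)
    for pred in predictions:
        if pred.label in RISING_LABELS and pred.confidence > first_threshold:
            # The prediction label is used as the label of the new sample.
            expanded.append((pred.sequence, pred.label))
    return expanded
```

Returning a new list rather than modifying the input keeps the training set of each round available for comparison across rounds.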

Step 209, returning to execute the steps of training the recognition model by using the training set and testing the trained recognition model by using the test set to obtain the performance parameter, until the performance parameter is lower than the second threshold.

Wherein the second threshold is preset.

It should be noted that, when the recognition model is trained by using the newly generated training set, the accuracy and the recall rate are recalculated on the test set, and the accuracy of the recognition model decreases as the number of search sequences in the test set increases. Since the lower the performance parameter is, the lower the accuracy of the recognition model recognition is, and the higher the performance parameter is, the higher the accuracy of the recognition model recognition is, in order to ensure the accuracy of the recognition model recognition, in the present application, when the performance parameter is lower, the testing of the recognition model may be stopped.

As an example, referring to fig. 6, fig. 6 is a schematic diagram of an accuracy rate variation curve of an identification model in a process of iteratively updating a test set in an embodiment of the present application. With the increase of search sequences in the test set, the accuracy of the identification model is reduced, and in order to ensure the identification accuracy of the identification model, the test on the identification model can be stopped when the performance parameter is low.

Step 210, when the performance parameter is lower than the second threshold, selecting the optimal performance parameter from the performance parameters obtained after training the recognition model with the training set in each previous round.

In the embodiment of the application, in order to ensure the accuracy of recognition of the recognition model, the optimal performance parameters can be selected from the performance parameters obtained after the recognition model is trained by adopting the training set through cyclic execution, and then the recognition model with the optimal performance parameters can be used as the trained recognition model, so that the accuracy of recognition can be improved.

As an example, referring to fig. 6, when the point a is reached, the performance parameter of the recognition model is optimal, and the recognition model corresponding to the point a may be used as a trained recognition model.

Step 211, taking the recognition model with the optimal performance parameter as the trained recognition model.
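The loop formed by steps 203 to 211 can be sketched as below. The callables `train_fn`, `evaluate_fn`, and `expand_fn`, as well as the `max_rounds` safeguard, are hypothetical placeholders introduced for illustration; the application does not prescribe their form.

```python
from typing import Any, Callable, List, Tuple


def train_until_degraded(
    train_fn: Callable[[list], Any],          # trains a model on a training set
    evaluate_fn: Callable[[Any], float],      # performance parameter on the test set
    expand_fn: Callable[[Any, list], list],   # adds confident candidates (steps 206-208)
    training_set: list,
    second_threshold: float,
    max_rounds: int = 50,                     # assumption: guard against endless loops
) -> Tuple[Any, float]:
    """Repeat train/test/expand until the performance parameter drops below
    the second threshold, then return the best model seen so far."""
    history: List[Tuple[float, Any]] = []
    for _ in range(max_rounds):
        model = train_fn(training_set)
        score = evaluate_fn(model)
        history.append((score, model))
        if score < second_threshold:
            break  # stop condition of step 209
        expanded = expand_fn(model, training_set)
        if len(expanded) == len(training_set):
            break  # nothing was added, so another round would change nothing
        training_set = expanded
    # Steps 210-211: pick the round with the optimal performance parameter.
    best_score, best_model = max(history, key=lambda item: item[0])
    return best_model, best_score
```

Keeping every round in `history` is what allows the model at the peak of the curve to be recovered after the stop condition is met.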

Step 212, a search sequence corresponding to the object to be identified is obtained.

Step 213, identifying the search sequence corresponding to the object to be identified by using the recognition model.

Step 214, determining whether the object to be identified is in the preset development stage according to the information obtained by identification.

In the embodiment of the application, the identification model is adopted to identify the search sequence corresponding to the object to be identified, and the information obtained by identification can include the prediction label and the confidence of the object to be identified. Determining whether the object to be identified is in a preset development stage according to the information obtained by identification, wherein the method specifically comprises the following steps:

If, in the information obtained by identification, the prediction label of the object to be identified indicates that the search behavior data does not rise with time, the object to be identified is not in the preset development stage.

If, in the information obtained by identification, the prediction label of the object to be identified indicates that the search behavior data rises with time, whether the object to be identified is in the preset development stage can be further determined according to the corresponding confidence. Specifically, when the corresponding confidence is greater than the second threshold, it is determined that the object to be identified is in the preset development stage, and when the corresponding confidence is not greater than the second threshold, it is determined that the object to be identified is not in the preset development stage.
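The two-part decision of step 214 reduces to a short rule, sketched here under the assumption that the rising labels are "simple rise" and "complex rise"; the label strings and the function name are hypothetical.

```python
# Assumption: these two labels are the ones that indicate a rise over time.
RISING_LABELS = {"simple_rise", "complex_rise"}


def in_preset_stage(label: str, confidence: float, second_threshold: float) -> bool:
    """Return True only if the prediction label indicates a rise over time
    and the confidence is strictly greater than the second threshold."""
    if label not in RISING_LABELS:
        return False  # search behavior data does not rise with time
    return confidence > second_threshold
```

A confidence exactly equal to the threshold is treated as "not greater than" the threshold, so the object is not considered to be in the preset development stage.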

According to the method for identifying the development stage of an object in this embodiment, the trained recognition model is tested using the test set, so that the performance parameters of the recognition model can be determined. If the prediction label of the search sequence corresponding to a candidate object indicates that the search behavior data rises with time and the confidence is greater than the first threshold, the search sequence corresponding to the candidate object is added to the training set, which alleviates the problem of an insufficient training set. When the performance parameter falls below the second threshold, the optimal performance parameter is selected from the performance parameters obtained after each previous round of training the recognition model with the training set, and the recognition model with the optimal performance parameter is taken as the trained recognition model, which can improve the recognition accuracy of the recognition model.

As an application scenario, when the recognition model is an HMM, training enables the HMM to learn the hidden state sequence corresponding to an observation sequence, where the search sequences in the training set, the test set, and the candidate set can all be regarded as observation sequences. For the training set, the HMM can obtain the correspondence between labels and hidden state sequences according to the label and the hidden state sequence corresponding to each observation sequence. For example, suppose the HMM has 5 hidden states, denoted A, B, C, D, and E, and the search sequence (PVR sequence) corresponding to a search term is labeled. Assume that the search term is "shared automobile" and the corresponding PVR sequence is (0.005%, 0.008%, 0.014%, 0.025%, 0.1%, 0.1%); the corresponding hidden state sequence may then be A-C-B-D-E. According to this correspondence, the search sequences in the test set and the candidate set can be identified to obtain prediction labels.
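For readers unfamiliar with how a hidden state sequence is obtained from an observation sequence, the standard Viterbi algorithm is sketched below. This is a generic illustration, not necessarily the decoding used in the application: it assumes the PVR values have already been discretized into symbols, and the probability tables passed in are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def viterbi(
    observations: Sequence[str],
    states: Sequence[str],
    start_p: Dict[str, float],
    trans_p: Dict[str, Dict[str, float]],
    emit_p: Dict[str, Dict[str, float]],
) -> List[str]:
    """Return the most likely hidden state path for a discrete observation sequence."""

    def log(x: float) -> float:
        return math.log(x) if x > 0 else float("-inf")

    # best[t][s]: best log-probability of any path that ends in state s at time t
    best = [{s: log(start_p[s]) + log(emit_p[s][observations[0]]) for s in states}]
    back: List[Dict[str, str]] = [{}]
    for t in range(1, len(observations)):
        best.append({})
        back.append({})
        for s in states:
            prev, score = max(
                ((r, best[t - 1][r] + log(trans_p[r][s])) for r in states),
                key=lambda item: item[1],
            )
            best[t][s] = score + log(emit_p[s][observations[t]])
            back[t][s] = prev
    # Trace back from the best final state.
    path = [max(states, key=lambda s: best[-1][s])]
    for t in range(len(observations) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path
```

Working in log space avoids numerical underflow when the search sequence is long.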

As a possible implementation manner, when the preset development stage is specifically the initial development stage, referring to fig. 7, after step 104 or 213, the method for identifying the development stage of the object may further include the following steps:

step 301, when the object to be identified is determined to be in the initial development stage, determining the correlation between the object to be identified and at least two groups of people.

Alternatively, assume that a target object, e.g., emerging industry, is emerging from a small circle, such as a specific circle of young people, first-line city, etc. Based on such an assumption, the correlation of the object to be recognized with at least two groups of people can be calculated using mutual information.

As a possible implementation manner, the groups of people can be obtained by dividing according to different attributes of the same dimension, where the dimension includes one or a combination of city, gender, and age, and then the mutual information between each group of people and the object to be identified is calculated. Optionally, the mutual information between each group of people and the object to be identified may be calculated according to the following formula (the standard definition of mutual information):

$$I(X;Y)=\sum_{x\in X}\sum_{y\in Y}p(x,y)\log\frac{p(x,y)}{p(x)\,p(y)}$$

wherein X represents the PV in the search behavior data indicated by the search sequence of the object to be identified, Y represents a certain attribute of a certain dimension, p(x) represents the marginal probability distribution function of X, p(y) represents the marginal probability distribution function of Y, and p(x, y) represents the joint probability distribution function of X and Y.

After determining the mutual information, a correlation may be determined based on the mutual information. Specifically, the larger the mutual information is, the higher the correlation is, which indicates that the search behavior data corresponding to the object to be recognized has an obvious bias, the more likely the object to be recognized is to be in the preset development stage, and the smaller the mutual information is, the lower the correlation is, which indicates that the object to be recognized is less likely to be in the preset development stage.
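As an illustration, the standard definition of mutual information can be estimated from paired samples as in the sketch below. It assumes that the PV values have been discretized (for example into "high" and "low" bins) and that each sample pairs a PV bin with a crowd attribute; both choices, and the function name, are hypothetical.

```python
import math
from collections import Counter
from typing import Hashable, Iterable, Tuple


def mutual_information(pairs: Iterable[Tuple[Hashable, Hashable]]) -> float:
    """Estimate I(X;Y) in bits from (x, y) samples using empirical frequencies."""
    samples = list(pairs)
    n = len(samples)
    if n == 0:
        raise ValueError("at least one sample is required")
    joint = Counter(samples)               # counts of (x, y)
    px = Counter(x for x, _ in samples)    # marginal counts of x
    py = Counter(y for _, y in samples)    # marginal counts of y
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        mi += p_xy * math.log2(p_xy / ((px[x] / n) * (py[y] / n)))
    return mi
```

When the PV bin is fully determined by the crowd attribute the estimate is large, and when the two are independent it is zero, which is the behavior the correlation check relies on.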

Step 302, according to the correlation between the object to be identified and at least two groups of people, the development stage of the object to be identified is verified.

Since the target object, e.g., an emerging industry, emerges from a small circle, such as a specific circle of young people or a first-line city, the verification is determined to pass when the correlation between the object to be identified and one of the at least two groups of people is higher than a threshold and the correlation between the object to be identified and each of the other groups is lower than the threshold. That is, the object to be identified is determined to be in the preset development stage only when it is highly correlated with a particular group of people divided along a certain dimension. Wherein the threshold is preset.

As an example, referring to fig. 8, when the dimension is city, the groups of people obtained by division are: the first-line city crowd, the second-line city crowd, the third-line city crowd, the fourth-line city crowd, and the overseas crowd. When the search term used to search for the object to be identified is "shared automobile", the mutual information between the object to be identified and the first-line city crowd, the second-line city crowd, the third-line city crowd, the fourth-line city crowd, and the overseas crowd is: 0.95, 0.09, -0.94, -0.66. The object to be identified is therefore highly correlated only with the first-line city crowd, and it can be determined that the object to be identified is in the preset development stage.
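The verification rule of step 302 can be sketched as follows. The function name, the group names, and the threshold value used in the usage note are hypothetical, and the correlation values there are illustrative rather than taken from fig. 8.

```python
from typing import Dict


def passes_verification(correlations: Dict[str, float], threshold: float) -> bool:
    """Pass only if exactly one group is above the threshold and every
    other group is strictly below it."""
    if len(correlations) < 2:
        return False  # at least two groups of people are required
    above = sum(1 for c in correlations.values() if c > threshold)
    below = sum(1 for c in correlations.values() if c < threshold)
    return above == 1 and below == len(correlations) - 1
```

For instance, `passes_verification({"tier1": 0.9, "tier2": 0.1, "tier3": -0.5}, 0.5)` evaluates to `True`, whereas two groups above the threshold would cause the check to fail because the interest is no longer concentrated in one small circle.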

In the embodiment of the application, when the object to be recognized is determined to be in the initial development stage, the correlation between the object to be recognized and at least two groups of crowds is determined, and the development stage of the object to be recognized is verified according to the correlation between the object to be recognized and at least two groups of crowds. Therefore, the identification accuracy is improved, meanwhile, the manual participation degree is reduced, and the identification efficiency is improved.

In order to implement the above embodiments, the present application further provides an apparatus for identifying a development stage of an object.

Fig. 9 is a schematic structural diagram of an apparatus for identifying a development stage of an object according to a fourth embodiment of the present application.

As shown in fig. 9, the apparatus 100 for identifying the development stage of an object includes: a first obtaining module 101, a second obtaining module 102, a first identifying module 103, and a determining module 104, wherein:

a first obtaining module 101, configured to obtain a trained recognition model; the identification model is generated by training a search sequence of a target object after generating a corresponding search sequence for the target object in a preset development stage; the search sequence is used to indicate search behavior data of the corresponding object at a plurality of time points.

The second obtaining module 102 is configured to obtain a search sequence corresponding to an object to be identified.

As a possible implementation manner, the second obtaining module 102 is specifically configured to: acquiring a search word for searching an object to be identified; generating search behavior data corresponding to each second time length according to the search behavior data of the search word in each first time length; wherein the second time length comprises a plurality of first time lengths; and taking the search behavior data in each second time length as the search behavior data of the corresponding time point in the search sequence.
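The aggregation from first time lengths into second time lengths can be sketched as below, for example from daily values to weekly values. Summation and the dropping of an incomplete trailing window are assumptions made for illustration; this passage does not state how the values are combined.

```python
from typing import List, Sequence


def aggregate(values: Sequence[float], window: int) -> List[float]:
    """Combine consecutive groups of `window` first-time-length values
    into one value per second time length (by summation, as an assumption)."""
    if window <= 0:
        raise ValueError("window must be positive")
    # Only complete windows are kept; a trailing partial window is dropped.
    return [
        sum(values[i:i + window])
        for i in range(0, len(values) - window + 1, window)
    ]
```

Each element of the returned list then serves as the search behavior data of one time point in the search sequence.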

The first identification module 103 is configured to identify a search sequence corresponding to an object to be identified by using an identification model.

The determining module 104 is configured to determine whether the object to be identified is in a preset development stage according to the information obtained by identification.
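The cooperation of modules 101 to 104 might be organized as in the sketch below. The class name `StageIdentifier`, the callable signatures, and the representation of the recognition model as a function returning a (label, confidence) pair are all assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple

Model = Callable[[Sequence[float]], Tuple[str, float]]


@dataclass
class StageIdentifier:
    load_model: Callable[[], Model]                   # first obtaining module 101
    load_sequence: Callable[[str], Sequence[float]]   # second obtaining module 102
    decide: Callable[[str, float], bool]              # determining module 104

    def identify(self, object_name: str) -> bool:
        model = self.load_model()
        sequence = self.load_sequence(object_name)
        label, confidence = model(sequence)           # first identifying module 103
        return self.decide(label, confidence)
```

Passing the modules in as callables keeps the sketch independent of any particular model or data source.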

Further, in a possible implementation manner of the embodiment of the present application, referring to fig. 10, on the basis of the embodiment shown in fig. 9, the apparatus 100 for identifying a development stage of an object may further include:

A first generating module 105, configured to generate a training set according to the search sequence corresponding to the target object before the trained recognition model is obtained, and to acquire the labels of the search sequences in the training set; a label is used to indicate the trend of the search behavior data over time.

The second generating module 106 is configured to generate a test set according to the search sequence corresponding to the test object at each development stage, and obtain a label of the search sequence in the test set.

The training module 107 is configured to train the recognition model by using the training set.

The test module 108 is configured to test the trained recognition model by using the test set to obtain the prediction labels of the test set.

The processing module 109 is configured to obtain the performance parameters of the recognition model according to the difference between the prediction labels and the labels obtained when the test set was generated.

As a possible implementation manner, when the preset development stage is specifically the initial development stage, the apparatus 100 for identifying the development stage of the object may further include:

a third generating module 110, configured to generate a candidate set according to a search sequence corresponding to a candidate object; wherein the candidate object is different from the target object and the test object, and the search behavior data of the candidate object rises with time.

The second identification module 111 is configured to identify the candidate set by using the trained recognition model, so as to obtain the prediction label and the confidence of the search sequence corresponding to each candidate object.

The adding module 112 is configured to add the search sequence corresponding to a candidate object to the training set if the prediction label of that search sequence indicates that the search behavior data rises with time and the confidence is greater than the first threshold.

The execution module 113 is configured to return to execute the steps of training the recognition model by using the training set and testing the trained recognition model by using the test set to obtain the performance parameter, until the performance parameter is lower than the second threshold.

The first obtaining module 101 is specifically configured to: when the performance parameters are lower than a second threshold value, selecting the optimal performance parameters from the performance parameters obtained after training the recognition model by adopting the training set in the previous cycle execution; and taking the recognition model with the optimal performance parameters as a trained recognition model.

As a possible implementation manner, if the information obtained by identification includes a prediction label and a confidence of the object to be identified, the determining module 104 is specifically configured to: determine that the object to be identified is in the preset development stage if, in the information obtained by identification, the prediction label of the object to be identified indicates that the search behavior data rises with time and the corresponding confidence is greater than the second threshold.

The checking module 114 is configured to, after it is determined whether the object to be identified is in the preset development stage, determine the correlation between the object to be identified and at least two groups of people when the object to be identified is determined to be in the initial development stage, and to verify the development stage of the object to be identified according to the correlation between the object to be identified and the at least two groups of people.

As a possible implementation manner, the checking module 114 is specifically configured to: determine that the verification is passed if the correlation between the object to be identified and one of the at least two groups of people is higher than a threshold and the correlation between the object to be identified and each of the other groups of people is lower than the threshold.

As another possible implementation manner, the checking module 114 is specifically configured to: dividing different attributes with the same dimension to obtain each group of people; the dimensions include one or more combinations of cities, gender, and age; calculating mutual information of each group of people and the object to be identified; and determining the correlation according to the mutual information.

As a possible implementation manner, the recognition model is a hidden markov model with a preset number of hidden states; wherein, the number of the hidden states ranges from 4 to 6.

As a possible implementation, the search behavior data includes a search volume PV or a search ratio PVR.
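The relationship between the two kinds of search behavior data can be illustrated as below, under the assumption that the search ratio PVR at a time point is the search volume PV of the search term divided by the total search volume at the same time point. This passage does not define PVR, so the definition, the function name, and the parameter names are assumptions.

```python
from typing import List, Sequence


def search_ratio(term_pv: Sequence[float], total_pv: Sequence[float]) -> List[float]:
    """Return an assumed PVR per time point: the term's PV divided by the total PV."""
    if len(term_pv) != len(total_pv):
        raise ValueError("both sequences must cover the same time points")
    # A time point with no searches at all is given a ratio of 0.0.
    return [t / total if total else 0.0 for t, total in zip(term_pv, total_pv)]
```

A ratio of this kind is less sensitive to overall traffic growth than the raw PV, which may be one reason a search ratio is offered as an alternative.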

It should be noted that the foregoing explanation of the embodiment of the method for identifying the development stage of an object is also applicable to the apparatus 100 for identifying the development stage of an object in this embodiment, and is not repeated herein.

In order to implement the foregoing embodiments, the present application also provides a computer device, including: memory, processor and computer program stored on the memory and executable on the processor, which when executed by the processor, performs the method of identifying stages of development of an object as set forth in the foregoing embodiments of the application.

In order to achieve the above embodiments, the present application also proposes a non-transitory computer-readable storage medium having a computer program stored thereon, characterized in that the program, when executed by a processor, implements the method of identifying a development stage of an object as proposed in the previous embodiments of the present application.

In order to implement the foregoing embodiments, the present application further proposes a computer program product, wherein when the instructions of the computer program product are executed by a processor, the method for identifying development stages of an object as proposed by the foregoing embodiments of the present application is executed.

FIG. 11 illustrates a block diagram of an exemplary computer device suitable for use in implementing embodiments of the present application. The computer device 12 shown in fig. 11 is only an example, and should not bring any limitation to the function and the use range of the embodiment of the present application.

As shown in FIG. 11, computer device 12 is embodied in the form of a general purpose computing device. The components of computer device 12 may include, but are not limited to: one or more processors or processing units 16, a system memory 28, and a bus 18 that couples various system components including the system memory 28 and the processing unit 16.

Bus 18 represents one or more of any of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. By way of example, such architectures include, but are not limited to, Industry Standard Architecture (ISA) bus, Micro Channel Architecture (MCA) bus, enhanced ISA bus, Video Electronics Standards Association (VESA) local bus, and Peripheral Component Interconnect (PCI) bus.

Computer device 12 typically includes a variety of computer system readable media. Such media may be any available media that is accessible by computer device 12 and includes both volatile and nonvolatile media, removable and non-removable media.

Memory 28 may include computer system readable media in the form of volatile Memory, such as Random Access Memory (RAM) 30 and/or cache Memory 32. Computer device 12 may further include other removable/non-removable, volatile/nonvolatile computer system storage media. By way of example only, storage system 34 may be used to read from and write to non-removable, nonvolatile magnetic media (not shown in FIG. 11, and commonly referred to as a "hard drive"). Although not shown in FIG. 11, a disk drive for reading from and writing to a removable, nonvolatile magnetic disk (e.g., a "floppy disk") and an optical disk drive for reading from or writing to a removable, nonvolatile optical disk (e.g., a Compact disk Read Only Memory (CD-ROM), a Digital versatile disk Read Only Memory (DVD-ROM), or other optical media) may be provided. In these cases, each drive may be connected to bus 18 by one or more data media interfaces. Memory 28 may include at least one program product having a set (e.g., at least one) of program modules that are configured to carry out the functions of embodiments of the application.

A program/utility 40 having a set (at least one) of program modules 42 may be stored, for example, in memory 28. Such program modules 42 include, but are not limited to, an operating system, one or more application programs, other program modules, and program data; each of these examples, or some combination thereof, may include an implementation of a network environment. Program modules 42 generally perform the functions and/or methodologies of the embodiments described herein.

Computer device 12 may also communicate with one or more external devices 14 (e.g., keyboard, pointing device, display 24, etc.), with one or more devices that enable a user to interact with computer device 12, and/or with any devices (e.g., network card, modem, etc.) that enable computer device 12 to communicate with one or more other computing devices. Such communication may be through an input/output (I/O) interface 22. Moreover, computer device 12 may also communicate with one or more networks (e.g., a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public Network such as the Internet) via Network adapter 20. As shown, network adapter 20 communicates with the other modules of computer device 12 via bus 18. It should be understood that although not shown in the figures, other hardware and/or software modules may be used in conjunction with computer device 12, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems, among others.

The processing unit 16 executes various functional applications and data processing, such as implementing the method of identifying the development stage of an object mentioned in the foregoing embodiments, by executing a program stored in the system memory 28.

In the description herein, reference to the terms "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the application. In this specification, the schematic representations of the terms used above are not necessarily intended to refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. In addition, those skilled in the art may combine the various embodiments or examples described in this specification, and the features of different embodiments or examples, provided they do not contradict one another.

Furthermore, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present application, "plurality" means at least two, e.g., two, three, etc., unless specifically limited otherwise.

Any process or method descriptions in flow charts or otherwise described herein may be understood as representing modules, segments, or portions of code which include one or more executable instructions for implementing steps of a custom logic function or process, and alternate implementations are included within the scope of the preferred embodiment of the present application in which functions may be executed out of order from that shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those reasonably skilled in the art of the present application.

The logic and/or steps represented in the flowcharts or otherwise described herein, e.g., an ordered listing of executable instructions that can be considered to implement logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, processor-containing system, or other system that can fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of the computer-readable medium would include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CDROM). Additionally, the computer-readable medium could even be paper or another suitable medium upon which the program is printed, as the program can be electronically captured, via for instance optical scanning of the paper or other medium, then compiled, interpreted or otherwise processed in a suitable manner if necessary, and then stored in a computer memory.

It should be understood that portions of the present application may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, the various steps or methods may be implemented in software or firmware stored in memory and executed by a suitable instruction execution system. If implemented in hardware, as in another embodiment, any one or combination of the following techniques, which are known in the art, may be used: a discrete logic circuit having a logic gate circuit for implementing a logic function on a data signal, an application specific integrated circuit having an appropriate combinational logic gate circuit, a Programmable Gate Array (PGA), a Field Programmable Gate Array (FPGA), or the like.

Those skilled in the art will understand that all or part of the steps of the methods in the above embodiments may be carried out by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium and, when executed, performs one or a combination of the steps of the method embodiments.

In addition, functional units in the embodiments of the present application may be integrated into one processing module, or each unit may exist alone physically, or two or more units are integrated into one module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode. The integrated module, if implemented in the form of a software functional module and sold or used as a stand-alone product, may also be stored in a computer readable storage medium.

The storage medium mentioned above may be a read-only memory, a magnetic or optical disk, etc. Although embodiments of the present application have been shown and described above, it is understood that the above embodiments are exemplary and should not be construed as limiting the present application, and that variations, modifications, substitutions and alterations may be made to the above embodiments by those of ordinary skill in the art within the scope of the present application.

Claims

1. A method of identifying a stage of development of an object, the method comprising the steps of:

acquiring a search sequence corresponding to an object to be identified;

determining whether the object to be identified is in the preset development stage or not according to the information obtained by identification;

the acquiring of the search sequence corresponding to the object to be identified includes:

acquiring a search word for searching the object to be identified;

generating search behavior data corresponding to each second time length according to the search behavior data of the search word in each first time length; wherein the second duration comprises a plurality of the first durations;

and taking the search behavior data in each second time length as the search behavior data of the corresponding time point in the search sequence.
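
As a rough illustration of the aggregation step in claim 1, the sketch below sums daily search counts (the first duration) into weekly totals (the second duration) and uses each total as one time point of the search sequence. The function name, the choice of summation as the aggregation, the 7-day grouping, and the dropping of an incomplete trailing window are assumptions made for illustration; the claim itself only requires that each second duration span several first durations.

```python
from typing import List


def build_search_sequence(per_first_duration: List[float], group_size: int = 7) -> List[float]:
    """Aggregate first-duration values (e.g. daily) into second-duration values (e.g. weekly).

    Assumption: aggregation is a plain sum, and a trailing window with fewer
    than `group_size` values is dropped.
    """
    if group_size < 2:
        raise ValueError("a second duration must span several first durations")
    usable = len(per_first_duration) - len(per_first_duration) % group_size
    return [
        sum(per_first_duration[i:i + group_size])
        for i in range(0, usable, group_size)
    ]


# Hypothetical daily search volumes for one search word (15 days).
daily_pv = [10, 12, 9, 11, 14, 20, 18, 15, 17, 16, 19, 22, 30, 28, 5]
weekly_pv = build_search_sequence(daily_pv, group_size=7)
print(weekly_pv)  # [94, 147]
```

Coarser time points smooth out day-to-day noise, which is one plausible reason to aggregate before feeding the sequence to a recognition model.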

2. The method of identifying stages of development of an object of claim 1, wherein said obtaining a trained recognition model is preceded by:

generating a training set according to the search sequence corresponding to the target object, and acquiring the label of the search sequence in the training set; the label is used for indicating the trend of the search behavior data along time;

generating a test set according to the search sequence corresponding to the test object in each development stage, and acquiring the label of the search sequence in the test set;

training the recognition model by adopting the training set;

testing the trained recognition model by adopting the test set to obtain a prediction label of the test set;

and obtaining the performance parameters of the recognition model according to the difference between the prediction label and the label acquired when the test set is generated.

3. The method for identifying a development stage of an object according to claim 2, wherein the predetermined development stage is specifically an initial development stage, and after obtaining the performance parameters of the identification model, the method further comprises:

generating a candidate set according to the search sequence corresponding to the candidate object; wherein the candidate object is different from the target object and the test object, and the search behavior data of the candidate object rises with time;

adopting a trained recognition model to recognize the candidate set to obtain a prediction label and a confidence of a search sequence corresponding to the candidate object;

if the prediction label of the search sequence corresponding to the candidate object indicates that the search behavior data rises with time and the confidence is greater than a first threshold value, adding the search sequence corresponding to the candidate object into the training set;

and returning to execute the steps of training the recognition model by using the training set and testing the trained recognition model by using the test set to obtain the performance parameters until the performance parameters are lower than a second threshold value.

4. The method of identifying stages of development of an object according to claim 3, wherein said obtaining a trained recognition model comprises:

when the performance parameter is lower than the second threshold value, selecting an optimal performance parameter from the performance parameters obtained after the recognition model is trained by adopting the training set in the previous cycle execution;

and taking the recognition model with the optimal performance parameters as the trained recognition model.
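
The loop described in claims 2 to 4 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the recognition model is replaced by a toy slope-threshold classifier, the performance parameter is taken to be test-set accuracy, and the names `self_train`, `fit`, `predict`, and `score`, the `max_rounds` cap, and the early exit when no candidate is accepted are all hypothetical.

```python
from typing import List, Tuple

Example = Tuple[List[float], str]  # (search sequence, label)


def slope(seq: List[float]) -> float:
    return (seq[-1] - seq[0]) / (len(seq) - 1)


def fit(train_set: List[Example]) -> float:
    """Toy stand-in for training: a decision boundary between mean slopes."""
    rising = [slope(s) for s, y in train_set if y == "rising"]
    other = [slope(s) for s, y in train_set if y != "rising"]
    return (sum(rising) / len(rising) + sum(other) / len(other)) / 2


def predict(boundary: float, seq: List[float]) -> Tuple[str, float]:
    """Return (prediction label, confidence in [0, 1))."""
    margin = slope(seq) - boundary
    label = "rising" if margin > 0 else "other"
    return label, abs(margin) / (abs(margin) + 1.0)


def score(boundary: float, test_set: List[Example]) -> float:
    """Performance parameter, assumed here to be test-set accuracy."""
    hits = sum(predict(boundary, s)[0] == y for s, y in test_set)
    return hits / len(test_set)


def self_train(train_set, test_set, candidates, first_threshold, second_threshold, max_rounds=10):
    train_set, remaining, history = list(train_set), list(candidates), []
    for _ in range(max_rounds):
        model = fit(train_set)
        perf = score(model, test_set)
        history.append((perf, model))
        if perf < second_threshold:  # stop condition of claim 3
            break
        accepted = [s for s in remaining
                    if predict(model, s) [0] == "rising" and predict(model, s)[1] > first_threshold]
        if not accepted:  # assumption: nothing left to add, so stop
            break
        train_set.extend((s, "rising") for s in accepted)
        remaining = [s for s in remaining if s not in accepted]
    # Claim 4: keep the model with the best performance seen in any round.
    best_perf, best_model = max(history, key=lambda item: item[0])
    return best_model, best_perf


train = [([1, 2, 3, 4], "rising"), ([4, 3, 2, 1], "other")]
test = [([2, 4, 6, 8], "rising"), ([5, 5, 5, 5], "other"),
        ([1, 3, 5, 7], "rising"), ([9, 6, 3, 0], "other")]
candidates = [[0, 3, 6, 9], [1, 1, 2, 2]]
best_model, best_perf = self_train(train, test, candidates, first_threshold=0.6, second_threshold=0.8)
print(best_perf)
```

Keeping the per-round history is what allows the best model to be recovered once a later round degrades performance below the second threshold.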

5. The method for identifying the development stage of an object according to claim 3, wherein the information obtained by identification comprises a prediction label and a confidence of the object to be identified; the determining whether the object to be identified is in the preset development stage according to the information obtained by identification comprises the following steps:

in the information obtained by the identification, if the prediction label of the object to be identified indicates that the search behavior data rises with time and the corresponding confidence is greater than a second threshold value, determining that the object to be identified is in the preset development stage.

6. The method for identifying a development stage of an object according to any of claims 1 to 3, wherein the predetermined development stage is specifically an initial development stage, and the determining whether the object to be identified is in the predetermined development stage further comprises:

when the object to be identified is determined to be in the initial development stage, determining the correlation between the object to be identified and at least two groups of people;

and verifying the development stage of the object to be identified according to the correlation between the object to be identified and at least two groups of crowds.

7. The method for identifying the development stage of an object according to claim 6, wherein the verifying the development stage of the object to be identified according to the correlation between the object to be identified and at least two groups of people comprises:

and if the correlation between the object to be identified and one of the at least two groups of people is higher than a correlation threshold and the correlation between the object to be identified and another of the at least two groups of people is lower than the threshold, determining that the verification is passed.

8. The method of identifying an object's developmental stage according to claim 6, wherein said determining the relevance of the object to be identified to at least two groups of people comprises:

dividing people according to different attribute values within the same dimension to obtain each group of people; the dimension includes one or a combination of city, gender, and age;

calculating mutual information of each group of people and the object to be identified;

and determining the correlation according to the mutual information.
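
One way to read the verification in claims 6 to 8 is sketched below. It treats "user belongs to the group" and "user searched for the object" as two binary events and computes their mutual information from a 2x2 count table, then applies the check of claim 7. The event definition, the count layout, the group names, the threshold value, and the direct use of mutual information as the correlation are assumptions for illustration only.

```python
import math
from typing import Dict


def mutual_information(n11: int, n10: int, n01: int, n00: int) -> float:
    """Mutual information in bits between two binary events.

    n11: in group and searched      n10: in group, did not search
    n01: not in group, searched     n00: not in group, did not search
    """
    total = n11 + n10 + n01 + n00
    in_group, out_group = n11 + n10, n01 + n00
    searched, not_searched = n11 + n01, n10 + n00
    mi = 0.0
    for joint, row, col in (
        (n11, in_group, searched),
        (n10, in_group, not_searched),
        (n01, out_group, searched),
        (n00, out_group, not_searched),
    ):
        if joint:  # a zero cell contributes nothing
            mi += (joint / total) * math.log2(joint * total / (row * col))
    return mi


def verification_passed(correlations: Dict[str, float], threshold: float) -> bool:
    """Claim 7 check: one group above the threshold and another group below it."""
    values = list(correlations.values())
    return any(v > threshold for v in values) and any(v < threshold for v in values)


# Hypothetical counts for two age groups (dimension: age).
correlations = {
    "age_18_24": mutual_information(50, 0, 0, 50),     # searching tracks membership exactly
    "age_45_plus": mutual_information(25, 25, 25, 25),  # searching independent of membership
}
print(correlations)
print(verification_passed(correlations, threshold=0.1))
```

Mutual information measures the strength of dependence but not its direction, so a practical implementation might additionally check that the group searches for the object more, not less, than the rest of the population.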

9. A method of identifying a stage in the development of an object according to any of claims 1 to 3, characterized in that the identification model is a hidden markov model with a preset number of hidden states; wherein, the number of the hidden states ranges from 4 to 6.
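
For readers unfamiliar with hidden Markov models, the following sketch scores a discretized search sequence under an HMM with four hidden states using the scaled forward algorithm. The discretization into falling/flat/rising symbols, the left-to-right transition structure, the state interpretations, and every probability value are hypothetical; how the claimed model turns such scores into a prediction label and confidence is not specified in this section.

```python
import math
from typing import List, Sequence


def forward_log_likelihood(obs: Sequence[int], start: List[float],
                           trans: List[List[float]], emit: List[List[float]]) -> float:
    """Log-likelihood of a discrete observation sequence (scaled forward algorithm)."""
    n = len(start)
    if not 4 <= n <= 6:
        raise ValueError("this sketch expects 4 to 6 hidden states")
    log_lik = 0.0
    alpha: List[float] = []
    for t, symbol in enumerate(obs):
        if t == 0:
            alpha = [start[j] * emit[j][symbol] for j in range(n)]
        else:
            alpha = [sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][symbol]
                     for j in range(n)]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # sequence impossible under this model
        log_lik += math.log(scale)
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
    return log_lik


# Hypothetical model. Observation symbols: 0 = falling, 1 = flat, 2 = rising.
START = [1.0, 0.0, 0.0, 0.0]
TRANS = [
    [0.7, 0.3, 0.0, 0.0],
    [0.0, 0.7, 0.3, 0.0],
    [0.0, 0.0, 0.7, 0.3],
    [0.0, 0.0, 0.0, 1.0],
]
EMIT = [
    [0.20, 0.70, 0.10],  # state 0: mostly flat
    [0.10, 0.40, 0.50],  # state 1: starting to rise
    [0.05, 0.15, 0.80],  # state 2: mostly rising
    [0.30, 0.50, 0.20],  # state 3: levelling off
]

rising_seq = [1, 1, 2, 2, 2, 2]
falling_seq = [0, 0, 0, 0, 0, 0]
rising_ll = forward_log_likelihood(rising_seq, START, TRANS, EMIT)
falling_ll = forward_log_likelihood(falling_seq, START, TRANS, EMIT)
print(rising_ll > falling_ll)
```

Under these made-up parameters a steadily rising sequence scores higher than a steadily falling one, which is the kind of separation a trained model would need in order to flag objects whose search behavior rises over time.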

10. A method for identifying a development stage of an object according to any of claims 1-3, characterized in that the search behavior data comprises a search volume PV or a search ratio PVR.

11. An apparatus for identifying a stage of development of an object, the apparatus comprising:

the determining module is used for determining whether the object to be identified is in the preset development stage or not according to the information obtained by identification;

the second obtaining module is specifically configured to:

acquiring a search word for searching the object to be identified;

12. A computer device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the program, implements the method of identifying a stage of development of an object as claimed in any one of claims 1 to 10.

13. A non-transitory computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the method of identifying a stage of development of an object according to any one of claims 1 to 10.