CN113626624B

CN113626624B - Resource identification method and related device

Info

Publication number: CN113626624B
Application number: CN202111184988.8A
Authority: CN
Inventors: 刘刚
Original assignee: Tencent Technology Shenzhen Co Ltd
Current assignee: Tencent Technology Shenzhen Co Ltd
Priority date: 2021-10-12
Filing date: 2021-10-12
Publication date: 2021-12-21
Anticipated expiration: 2041-10-12
Also published as: CN113626624A

Abstract

The embodiments of the application disclose a resource identification method and a related device, which relate to machine learning, deep learning and the like in artificial intelligence and can be applied to fields such as cloud technology, artificial intelligence, intelligent traffic and assisted driving. A high-quality identifier is determined based on an influence parameter of the identifier in a second platform, and the high-quality identifier is used as the basis for judging whether a target identifier is a potential high-quality identifier. A target identification vector is constructed according to first identification attribute information of the target identifier and first object interaction information corresponding to resources published by the target identifier on a first platform, and a high-quality identification vector is constructed according to second identification attribute information of the high-quality identifier and second object interaction information corresponding to resources published by the high-quality identifier on the second platform. If similarity matching between the high-quality identification vector and the target identification vector succeeds, the target identifier is a potential high-quality identifier belonging to the same resource field as the high-quality identifier, and the audit order of the resource to be audited in the first platform is promoted, which prevents that resource from being backlogged.

Description

Resource identification method and related device

Technical Field

The present application relates to the field of computer technologies, and in particular, to a resource identification method and a related apparatus.

Background

In the age of rapid development of the internet, many websites allow users to upload resources such as videos, text and pictures for display. As the threshold for producing resources falls, the volume of uploaded resources grows exponentially. To ensure the security of resource distribution, resources must be audited within a short time, for example to check whether a resource involves sensitive content and to identify and handle its quality, security and the like.

At present, resource auditing relies mainly on manual review, which consumes a large amount of labor and is inefficient. In the related art, a machine learning algorithm and a content understanding algorithm are used for assistance: resources that obviously violate the law are filtered out by the machine learning algorithm, and the remaining resources are then analyzed by the understanding algorithm, for example to detect whether a resource is clickbait or whether its content description is not objective. Auditors then continue to review the resources in chronological order, taking these identification results into account.

However, because the manual review speed is lower than the uploading speed of the resources, the method still causes serious backlog of the resources, and easily causes that some high-quality resources cannot be quickly reviewed and displayed to the public, so that the viewing experience of the user is poor.

Disclosure of Invention

In order to solve the technical problem, the application provides a resource identification method and a related device, which are used for identifying a resource uploaded by a potential high-quality identifier, improving the auditing sequence of the resource, avoiding backlog of the high-quality resource and improving the viewing experience of a user.

The embodiment of the application discloses the following technical scheme:

in one aspect, an embodiment of the present application provides a resource identification method, where the method includes:

acquiring first identification attribute information of a target identification of a resource to be audited in a first platform and first object interaction information corresponding to the resource issued by the target identification on the first platform, and acquiring second identification attribute information of a high-quality identification in a second platform and second object interaction information corresponding to the resource issued by the high-quality identification on the second platform, wherein the high-quality identification is determined based on an influence parameter of the identification in the second platform;

constructing a target identification vector of the target identification according to the first identification attribute information and the first object interaction information, and constructing a high-quality identification vector of the high-quality identification according to the second identification attribute information and the second object interaction information;

if the similarity matching between the high-quality identification vector and the target identification vector is successful, determining that the target identification is a potential high-quality identification which belongs to the same resource field as the high-quality identification in the first platform, and promoting the auditing sequence of the resource to be audited in the first platform.

In another aspect, an embodiment of the present application provides a resource identification apparatus, where the apparatus includes an acquisition unit, a construction unit and an execution unit;

the acquisition unit is configured to acquire first identification attribute information of a target identification of a resource to be audited in a first platform and first object interaction information corresponding to the resource issued by the target identification on the first platform, and acquire second identification attribute information of a high-quality identification in a second platform and second object interaction information corresponding to the resource issued by the high-quality identification on the second platform, wherein the high-quality identification is determined based on an influence parameter of the identification in the second platform;

the construction unit is configured to construct a target identification vector of the target identification according to the first identification attribute information and the first object interaction information, and construct a high-quality identification vector of the high-quality identification according to the second identification attribute information and the second object interaction information;

the execution unit is configured to determine that the target identifier is a potential high-quality identifier in the first platform and belonging to the same resource field as the high-quality identifier if the similarity matching between the high-quality identifier vector and the target identifier vector is successful, and promote an audit sequence of the resource to be audited in the first platform.

In another aspect, an embodiment of the present application provides a computer device, where the computer device includes a processor and a memory:

the memory is used for storing program codes and transmitting the program codes to the processor;

the processor is configured to perform the method of the above aspect according to instructions in the program code.

In another aspect, the present application provides a computer-readable storage medium for storing a computer program for executing the method of the above aspect.

In another aspect, embodiments of the present application provide a computer program product or a computer program, which includes computer instructions stored in a computer-readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions to cause the computer device to perform the method of the above aspect.

According to the above technical solution, in the first platform, whether a resource to be audited is a high-quality resource can be identified by determining whether the target identifier that uploaded it is a potential high-quality identifier. A high-quality identifier is determined based on the influence parameter of the identifier in the second platform, and is used as the basis for judging whether the target identifier is a potential high-quality identifier. A target identification vector is constructed according to the first identification attribute information of the target identifier and the first object interaction information corresponding to the resources published by the target identifier on the first platform, and a high-quality identification vector is constructed according to the second identification attribute information of the high-quality identifier and the second object interaction information corresponding to the resources published by the high-quality identifier on the second platform. The identification attribute information makes clear which resource field an identifier belongs to, and the object interaction information can be used to mine, at the user interaction level, the characteristics that identifiers share along the high-quality dimension.
If similarity matching between the high-quality identification vector and the target identification vector succeeds, the target identifier is a potential high-quality identifier belonging to the same resource field as the high-quality identifier, and the audit order of the resource to be audited in the first platform can be promoted. Resources that are likely to be high-quality can then be audited quickly, which avoids a backlog of high-quality resources, reduces the time spent auditing them, improves the overall performance of resource auditing, and improves the viewing experience of users. Moreover, because the identification vectors are constructed from multi-dimensional information and then matched by similarity, identifier matching becomes faster, the matching range is narrowed, and the matching precision is improved.

Drawings

In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present application, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts.

Fig. 1 is a schematic view of an application scenario of a resource identification method according to an embodiment of the present application;

Fig. 2 is a flowchart of a resource identification method according to an embodiment of the present application;

Fig. 3 is a schematic diagram of generating an identification vector according to an embodiment of the present application;

Fig. 4 is a schematic diagram of a resource processing link according to an embodiment of the present application;

Fig. 5 is a schematic diagram of a crawling system provided by an embodiment of the present application;

Fig. 6 is a schematic structural diagram of a resource identification system according to an embodiment of the present application;

Fig. 7 is a schematic diagram of a resource identification apparatus according to an embodiment of the present application;

Fig. 8 is a schematic structural diagram of a server according to an embodiment of the present application;

Fig. 9 is a schematic structural diagram of a terminal device according to an embodiment of the present application.

Detailed Description

Embodiments of the present application are described below with reference to the accompanying drawings.

As the volume of uploaded information grows exponentially, even the auditing method in the related art still leads to a backlog of resources. This matters especially for the high-quality resources within the backlog, such as resources uploaded by a well-known account, resources forwarded more than ten thousand times on other platforms, or resources shared more than five thousand times on other platforms. If the high-quality resources in the platform cannot be audited quickly, they cannot be distributed quickly, which affects the viewing experience of users.

Based on this, the embodiment of the application provides a resource identification method and a related device, which are used for identifying a resource uploaded by a potential high-quality identifier, improving the auditing sequence of the resource, avoiding backlog of the high-quality resource, and improving the viewing experience of a user.

The resource identification method provided by the embodiment of the application is realized based on Artificial Intelligence (AI), which is a theory, method, technology and application system for simulating, extending and expanding human Intelligence by using a digital computer or a machine controlled by the digital computer, sensing the environment, acquiring knowledge and obtaining the best result by using the knowledge. In other words, artificial intelligence is a comprehensive technique of computer science that attempts to understand the essence of intelligence and produce a new intelligent machine that can react in a manner similar to human intelligence. Artificial intelligence is the research of the design principle and the realization method of various intelligent machines, so that the machines have the functions of perception, reasoning and decision making.

Artificial intelligence technology is a comprehensive discipline that covers a wide range of fields, including both hardware-level and software-level technologies. The artificial intelligence infrastructure generally includes technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction systems, mechatronics, and the like. Artificial intelligence software technology mainly includes computer vision technology, speech processing technology, natural language processing technology, machine learning/deep learning, automatic driving, intelligent traffic and the like.

In the embodiment of the present application, the artificial intelligence software technology mainly involved includes the directions of computer vision, machine learning/deep learning, and the like.

The resource identification method provided by the application can be applied to resource identification equipment with data processing capacity, such as terminal equipment and servers. The terminal device may be, but is not limited to, a smart phone, a desktop computer, a notebook computer, a tablet computer, a smart speaker, a smart watch, a smart television, a smart voice interaction device, a smart home appliance, a vehicle-mounted terminal, and the like; the server may be an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, or a cloud server providing cloud computing services. The terminal device and the server may be directly or indirectly connected through wired or wireless communication, and the application is not limited herein. The embodiment of the invention can be applied to various scenes including but not limited to cloud technology, artificial intelligence, intelligent traffic, driving assistance and the like.

The resource identification device may have computer vision capability. Computer vision (CV) is the science of studying how to make machines "see": cameras and computers are used in place of human eyes to identify, track and measure targets, and the captured images are further processed so that they become more suitable for human observation or for transmission to instruments for detection. As a scientific discipline, computer vision studies theories and techniques that attempt to build artificial intelligence systems capable of obtaining information from images or multidimensional data. Computer vision technology generally includes image processing, image recognition, image semantic understanding, image retrieval, OCR, video processing, video semantic understanding, video content/behavior recognition, three-dimensional object reconstruction, 3D technology, virtual reality, augmented reality, simultaneous localization and mapping, automatic driving, intelligent transportation and other technologies, as well as common biometric identification technologies such as face recognition and fingerprint recognition.

The resource identification device may have machine learning capability. Machine learning (ML) is a multi-disciplinary subject involving probability theory, statistics, approximation theory, convex analysis, algorithm complexity theory and other disciplines. It studies how a computer simulates or implements human learning behavior in order to acquire new knowledge or skills and reorganize existing knowledge structures so as to continuously improve its own performance. Machine learning is the core of artificial intelligence and the fundamental way to make computers intelligent, and it is applied throughout all fields of artificial intelligence. Machine learning and deep learning generally include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning, and learning from demonstration.

In the resource identification method provided by the embodiment of the application, the adopted artificial intelligence model mainly relates to the application of machine learning, and the target identification vector and the high-quality identification vector are generated through deep learning technology and the like in the machine learning and are matched.

In order to facilitate understanding of the technical solution of the present application, a server is taken as a resource identification device to introduce the resource identification method provided in the embodiments of the present application in combination with an actual application scenario.

Referring to fig. 1, the figure is a schematic view of an application scenario of a resource identification method provided in an embodiment of the present application. In the application scenario shown in fig. 1, the terminal device 110 and the server 120 are included, where the server 120 is configured to generate an identification vector, and improve an audit sequence of uploaded resources based on similarity matching.

In practical applications, the user may use the terminal device 110 to publish a resource in the first platform under a registered target identifier. An identifier is content, such as numbers and symbols, that represents the user on a platform, and the target identifier is the content that represents the user on the first platform. A high-quality identifier is content that represents a user on another platform, such as the second platform, where most of the resources that the user publishes under that identifier are high-quality resources. This embodiment takes an account as an example of an identifier. The resource may be, for example, a video, text or a picture, and this embodiment takes a video as an example. The server 120 obtains the video uploaded by the user through the network, and the video to be audited is displayed on the first platform after it passes the audit.

To avoid a backlog of high-quality videos in the first platform, which would harm the viewing experience of users, whether a video to be audited is a high-quality video can be identified by determining whether the target account that uploaded it is a potential high-quality account.

To mine whether the target account is a potential high-quality account, a high-quality account on another platform, such as the second platform, can be used as the comparison standard. If the target account on the first platform is similar to a high-quality account on the second platform, the target account may be a potential high-quality account; for example, the target account may be a secondary account that the owner of the high-quality account on the second platform operates on the first platform. The high-quality account may be determined based on the influence parameter of the account in the second platform.

The first identification attribute information is the identification attribute information corresponding to the target identifier on the first platform, and the second identification attribute information is the identification attribute information corresponding to the high-quality identifier on the second platform. The identification attribute information represents information such as the category and the labels of the identifier, so that the resource field of the identifier can be made clear and the similarity between identifiers belonging to the same resource field is meaningful. The object interaction information represents interaction behaviors between the identifier and users, for example a user following an account, or browsing, playing, commenting on, collecting, forwarding, sharing or liking the resources published by the account. From it, the characteristics that identifiers share along the high-quality dimension at the user interaction level can be mined, because similar identifiers attract a large number of similar user interaction behaviors; for example, the ratio of the number of forwards to the number of comments on their resources falls within a certain interval.

The server 120 generates a target identification vector according to the first account attribute information of the target account and the first user interaction information corresponding to the resource issued by the target account on the first platform, and constructs a high-quality identification vector according to the second account attribute information of the high-quality account and the second user interaction information corresponding to the resource issued by the high-quality account on the second platform.

If similarity matching between the high-quality identification vector and the target identification vector succeeds, for example because the similarity between the two vectors is 90%, the target account is a potential high-quality account belonging to the same resource field as the high-quality account. A video uploaded by the potential high-quality account is very likely a high-quality video, so the audit order of the video to be audited in the first platform can be promoted; for example, a video that would be 4th in the queue to be audited is promoted to 1st in the queue.

In this way, videos to be audited in the first platform that are likely to be high-quality can be audited quickly, which avoids a backlog of high-quality resources in the first platform, reduces the time spent auditing high-quality videos, improves the overall performance of video auditing, and improves the viewing experience of users. Moreover, because the identification vectors are constructed from multi-dimensional information and then matched by similarity, account matching becomes faster; and because matching is carried out within a single resource field, the matching range is narrowed and the matching precision is improved.
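The matching and promotion described in this scenario can be sketched as follows. This is only an illustrative sketch under stated assumptions: the text does not name a similarity measure, so cosine similarity is assumed here, the 0.9 threshold mirrors the 90% example above, and the names `PendingResource`, `cosine_similarity` and `promote_matches` as well as all vector values are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class PendingResource:
    resource_id: str
    uploader_vector: List[float]  # identification vector of the uploading account


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (assumed measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector is treated as matching nothing
    return dot / (norm_a * norm_b)


def promote_matches(queue, quality_vectors, threshold=0.9):
    """Move resources whose uploader matches a high-quality vector to the front.

    Both groups keep their original (upload-time) order.
    """
    def matches(item):
        return any(
            cosine_similarity(item.uploader_vector, q) >= threshold
            for q in quality_vectors
        )

    promoted = [item for item in queue if matches(item)]
    remaining = [item for item in queue if not matches(item)]
    return promoted + remaining


# Hypothetical data: one high-quality vector and four videos awaiting audit.
quality_vectors = [[1.0, 0.0, 1.0]]
queue = [
    PendingResource("video_1", [0.0, 1.0, 0.0]),
    PendingResource("video_2", [0.1, 1.0, 0.0]),
    PendingResource("video_3", [0.0, 0.8, 0.2]),
    PendingResource("video_4", [0.9, 0.1, 1.0]),
]
reordered = promote_matches(queue, quality_vectors)
print([item.resource_id for item in reordered])
```

With this sample data only the uploader of `video_4` clears the threshold, so that video moves from 4th to 1st place while the other three keep their relative order.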

A resource identification method provided in the embodiment of the present application is described below with reference to the accompanying drawings and using a server as a resource identification device.

Referring to fig. 2, the figure is a flowchart of a resource identification method according to an embodiment of the present application. As shown in fig. 2, the resource identification method includes the following steps:

S201: Obtain first identification attribute information of a target identification of a resource to be audited in a first platform and first object interaction information corresponding to the resource issued by the target identification on the first platform, and obtain second identification attribute information of a high-quality identification in a second platform and second object interaction information corresponding to the resource issued by the high-quality identification on the second platform.

In practical applications, a user may upload resources in a platform through an account registered in the platform by using a terminal program and/or a server-side program. For example, a user uploads a resource in a first platform through a target identifier, and for the first platform, the resource is a resource to be audited and is recommended to the user in the first platform for viewing after the audit is passed.

Because the volume of uploaded resources grows exponentially while auditing capacity is limited, resources to be audited become backlogged. If a backlogged resource is a high-quality resource and it cannot be recommended to users in the first platform for a long time, the user who uploaded it may stop uploading high-quality resources to the first platform. The first platform then loses high-quality resources and, in turn, the users who like to view them, which leads to problems such as reduced user stickiness and reduced usage time.

Based on this, it is necessary to determine whether the resource to be audited is a high-quality resource, so as to avoid a backlog of high-quality resources. This can be done by judging whether the target identifier that uploaded the resource is a high-quality identifier or a potential high-quality identifier. Because such identifiers are highly likely to upload high-quality resources, if the target identifier is a high-quality identifier or a potential high-quality identifier, the resource to be audited can be determined to be a high-quality resource.

The high-quality identifier can be determined through the influence parameter of the identifier. Taking an account as an example of the identifier, the influence parameter of the account can be determined from two aspects. The first is prior information related to the account, such as the number of followers of the account, the popularity of the account owner and the account ranking. The second is object interaction information, that is, information about interaction with users after the resources uploaded by the account are published and recommended, such as the number of likes, forwards and reads of the resources. Object interaction information is the content of interactions between objects; when the object is a user, the object interaction information may be, for example, that a resource has been liked more than five thousand times or forwarded more than ten thousand times. The specific values can be determined by the platform according to its services and are not specifically limited in this application. It should be noted that the account information, user interaction information and other user-related content acquired in this application are all obtained with the consent of the user or the account owner.
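As a rough illustration of how the two aspects might be combined, the sketch below flags an account as high-quality when its prior information and its interaction information both clear a threshold. The like and forward thresholds reuse the example figures above; the follower threshold, the rule for combining the aspects, and the names `AccountStats` and `is_high_quality` are assumptions made only for this example, since the text leaves the concrete values and rule to the platform.

```python
from dataclasses import dataclass


@dataclass
class AccountStats:
    followers: int            # prior information about the account
    likes_per_resource: int   # interaction information after publication
    forwards_per_resource: int


def is_high_quality(
    stats: AccountStats,
    min_followers: int = 100_000,   # hypothetical platform-specific value
    min_likes: int = 5_000,         # example figure from the text
    min_forwards: int = 10_000,     # example figure from the text
) -> bool:
    """Assumed rule: prior information AND at least one interaction signal."""
    prior_ok = stats.followers >= min_followers
    interaction_ok = (
        stats.likes_per_resource > min_likes
        or stats.forwards_per_resource > min_forwards
    )
    return prior_ok and interaction_ok


popular = AccountStats(followers=250_000, likes_per_resource=8_000, forwards_per_resource=3_000)
quiet = AccountStats(followers=250_000, likes_per_resource=100, forwards_per_resource=200)
print(is_high_quality(popular), is_high_quality(quiet))
```

A platform could just as well compute a weighted score instead of a boolean rule; the point is only that both kinds of information feed the decision.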

A potential high-quality identifier is an identifier that may become a high-quality identifier. Compared with a high-quality identifier, a potential high-quality identifier is still in its growth stage and has uploaded or published few resources, so it is difficult to determine through an influence parameter. In particular, when the first platform has little or no data that can serve as a reference, it is unclear whether the target identifier is a potential high-quality identifier, and if the judgment is postponed until the potential high-quality identifier has "grown", the delay causes a loss of high-quality resources. Therefore, whether the target identifier is a potential high-quality identifier can be determined with the help of other platforms, for example through the high-quality identifiers in the second platform.

The high-quality identifier in the second platform may be determined based on the influence parameter of the identifier in the second platform. The embodiments of the application do not specifically limit how the influence parameter of the identifier in the second platform is obtained. Taking an account as an example of the identifier, the information needed to determine the influence parameter of the account in the second platform can be obtained according to crawling rules, and the crawling rules can be set according to the service conditions of the second platform, for example which websites and which resource sources are crawled on the client side. The crawling rules can be set by a crawling system, which is described later with reference to Fig. 5 and is not detailed here.

The first platform and the second platform are different platforms, and both are media on which users can upload, browse and share resources. The presentation forms of resources in a platform include, but are not limited to, articles, pictures and videos. An article may include any one or a combination of pictures and videos, and videos include vertical videos and horizontal videos. A user may upload videos to the platform through a registered account, and the videos are provided to other users on the platform for viewing in the form of a feed stream.

It should be noted that a feed (also rendered as information feed, contribution, summary, source, news subscription or web source; English: web feed, news feed, syndicated feed) is a data format through which a website delivers its latest information to users. Feeds are usually arranged along a timeline, which is the most original and basic presentation form of a feed. A prerequisite for a user to subscribe to a website is that the website provides a feed. Combining feeds in one place is called aggregation, and the software used for aggregation is called an aggregator. For the end user, an aggregator is software dedicated to subscribing to websites, and is also commonly referred to as an RSS reader (Rich Site Summary reader), feed reader, news reader, etc.

S202: Construct a target identification vector of the target identification according to the first identification attribute information and the first object interaction information, and construct a high-quality identification vector of the high-quality identification according to the second identification attribute information and the second object interaction information.

In the related art, taking the identification being an account as an example, accounts are matched by string matching on the account name and the account registration information. However, even for the same resource uploader, this information may differ across platforms. For example, the account registered by user A in the first platform is named "account of user A", while the account registered by user A in the second platform is named "this account belongs to user A". Because the account names differ, the matching effect is poor.

Based on this, matching can be performed using identification vectors that characterize the features of the identifications. To this end, the following are acquired: first identification attribute information of the target identification of a resource to be audited in the first platform, first object interaction information corresponding to the resources published by the target identification on the first platform, second identification attribute information of a high-quality identification in the second platform, and second object interaction information corresponding to the resources published by the high-quality identification on the second platform.

The identification attribute information may be a resource classification, resource labels and the like. With it, accounts can be screened by their resource field (such as the science and technology field, the humanities field and the like) and by their category (such as newly popular, hit content, and local), so that the resource field of the identification is determined; similarity is only meaningful between accounts belonging to the same resource field. The object interaction information, for example user interaction information, represents the interaction behaviors between users and the account, such as a user following the account, or browsing, playing, commenting on, collecting, forwarding, sharing or liking the resources published by the account. From it, the similar characteristics of accounts in the high-quality dimension can be mined at the user interaction level.

For the account in the first platform, a target identification vector of the target identification is constructed according to the first identification attribute information and the first object interaction information, and the target identification vector can represent the high-quality characteristics of the target identification in the corresponding resource field. For the account in the second platform, a high-quality identification vector of the high-quality identification is constructed according to the second identification attribute information and the second object interaction information, and the high-quality identification vector can represent the high-quality characteristics of the high-quality identification in the corresponding resource field.

In order to improve the capability of the identification vector to embody the high-quality characteristics, first content of the resource to be audited and second content of the resources published by the high-quality identification can also be obtained. The target identification vector of the target identification is then constructed according to the first identification attribute information, the first object interaction information and the first content, and the high-quality identification vector of the high-quality identification is constructed according to the second identification attribute information, the second object interaction information and the second content.

Therefore, in addition to being constructed from the account attribute information and the object interaction information, the identification vector can further incorporate the content, for example by fusing the content vector corresponding to the content into the identification vector. The high-quality characteristics of the account are thus fully mined along the content dimension, the identification vectors are enriched, the matching range is further narrowed, and the precision of the subsequent identification vector matching is improved.

S203: If the similarity matching between the high-quality identification vector and the target identification vector is successful, determining that the target identification is a potential high-quality identification in the first platform that belongs to the same resource field as the high-quality identification, and promoting the auditing sequence of the resource to be audited in the first platform.

Whether the similarity matching between the high-quality identification vector and the target identification vector is successful can be determined, for example, by metric learning: a distance or similarity between the two vectors, such as the cosine similarity, is calculated, and when it satisfies a preset condition, for example the cosine similarity is greater than 80%, the similarity matching is considered successful. This indicates that the target identification and the high-quality identification are identical or similar and belong to the same resource field, so the target identification can be determined as a potential high-quality identification in the first platform.
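The matching condition described above can be sketched as follows. This is a minimal illustration only: the function names, the treatment of zero vectors, and the default threshold of 0.8 (taken from the "greater than 80%" example) are assumptions, and the vectors are plain Python lists rather than the output of any particular model.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimension")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # assumption: a zero vector matches nothing
    return dot / (norm_u * norm_v)

def is_potential_high_quality(target_vec, quality_vecs, threshold=0.8):
    """True if the target vector matches at least one high-quality vector.

    `threshold` is the hypothetical preset condition (similarity > 80%).
    """
    return any(cosine_similarity(target_vec, q) > threshold for q in quality_vecs)
```

In practice the comparison would run against an index of many high-quality identification vectors; the linear scan here is only meant to make the threshold test concrete.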

The potential high-quality identification in the first platform is similar to the high-quality identification in the second platform in the high-quality dimension. As time goes on, it may become a high-quality identification in the first platform, and the resources it uploads on the first platform are likely to be high-quality resources. Therefore, to avoid a backlog of high-quality resources, the auditing sequence of the resources to be audited uploaded by the target identification in the first platform can be promoted, which speeds up the auditing of high-quality resources and reduces the time they spend in the resource processing link.

By identifying the high-quality resources among the resources to be audited and promoting their auditing sequence, resources known to be high-quality are placed at the head of the audit queue, so more high-quality resources can be audited at the same auditing cost and manpower. Measured by the auditing amount consumed to enable the same number of high-quality resources, the method provided by the embodiment of the application requires less auditing than the related-art approach of auditing resources in chronological order. Meanwhile, the recommendation capability of a recommendation system is limited and not all content gets an exposure opportunity; the method provided by the embodiment of the application can raise the proportion of high-quality resources in the recommendation pool, reduce the auditing cost and increase the exposure rate of high-quality resources.

Meanwhile, the enabling rate and the enabling amount of the potential high-quality identifications can be increased, where the enabling rate is the ratio of the number of resources that pass the audit to the total number of audited resources, and the enabling amount is the number of resources that pass the audit. Auditing high-quality resources sooner is equivalent to raising the proportion of high-quality resources at the head of the audit queue, so the number of resources passing the audit increases and the time high-quality resources spend in the resource processing link decreases.

As a possible implementation manner, when the identification vector is generated based on the account attribute information, the object interaction information and the content, if the similarity matching between the high-quality identification vector and the target identification vector is successful, and the resource to be audited uploaded by the target identification on the first platform is the same as a resource already published by the high-quality identification on the second platform, or the resource to be audited has already been published by the high-quality identification on the second platform, then the target identification and the high-quality identification are homologous identifications, that is, the user who registered the target identification on the first platform and the user who registered the high-quality identification on the second platform are the same person. In this case, since the resource to be audited has already been published, or already been audited, on the second platform, the first platform may adopt a publish-first, audit-later mechanism, that is, the resource to be audited is published first and audited afterwards.

The embodiment of the application does not specifically limit how the audit is performed after publication. For example, the audit can be performed when the manual audit workload is light. For another example, the comments posted after the resource to be audited is published are collected, emotion mining and analysis are performed on them, and problematic resources are manually re-reviewed or withdrawn from the first platform. For another example, if the resource to be audited was previously filtered out and became a disabled resource, it may be re-enabled: because the second platform was able to publish the resource, its security level is relatively high.

The embodiment of the present application does not specifically limit the manner of generating the identification vector. The following takes the generation of the target identification vector as an example; see S2021-S2023.

S2021: Acquiring an identification association relation that reflects the associations among the identifications in the first platform.

Similar identifications share a large amount of object interaction information from the same users. For example, if a large number of users follow or click on account B immediately after following or clicking on account A, then account A and account B are similar. The high-quality characteristics of the identifications can therefore be further mined through the identification association relation among similar identifications.

The embodiment of the application does not specifically limit the form of the identification association relation. For example, it may be a weighted directed graph among user accounts, in which nodes represent accounts and the edges between nodes carry weights, and the weights may be determined by the number or the type of user interaction behaviors. For example, a weight may be determined by the total number of times two accounts are followed in common. For another example, the weights corresponding to different types of user interaction behaviors may be set by the platform according to the cost, in the business, of a user initiating one such behavior; generally speaking, commenting on a resource requires the user to perform an input operation, so its cost is the highest.

It should be noted that a single kind of object interaction behavior may also be used on its own, without the other kinds, so that different versions of the weighted directed graph are constructed and the account vectors, that is, the identification vectors, are characterized at a finer granularity. A version here refers to a version of the identification vector model; versions may be obtained by training on sample data from different time periods, or may be identification vectors constructed separately for image-text content and video content based on image-text accounts or video accounts, which is not specifically limited in this application.

Referring to fig. 3, this figure is a schematic diagram of generating an identification vector according to an embodiment of the present application. In fig. 3, account vectors are generated according to the user interaction information sequences of three users in the first platform. For example, a weighted directed graph between accounts is constructed from the user interaction information sequence of user U1. Taking following an account as the user interaction information, the user interaction information sequence of user U1 is: follow account D, follow account A, follow account B. The same applies to user U2 and user U3.

It should be noted that using a user's entire user interaction information sequence would consume huge computing and storage resources; moreover, a user's interest changes over a long period but remains roughly the same over a short period. Based on this, the user interaction information sequence needs to be cut, for example wherever the interval between two behaviors exceeds a preset time length (for example, ten minutes): if user U2 follows account E and then follows account D more than 10 minutes later, the sequence is cut there, and following account E and following account D do not belong to the same user interaction information sequence. In this way, a weighted directed graph can be generated based on the user interaction information sequences of the three users.
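The cutting of interaction sequences and the construction of the weighted directed graph can be sketched as below. The function names, the `(timestamp, account)` event format, and the choice of counting consecutive follows as the edge weight are illustrative assumptions; the text leaves the weighting scheme open.

```python
from collections import defaultdict

def split_sessions(events, max_gap_seconds=600):
    """Cut one user's (timestamp_seconds, account) events into sequences.

    A new sequence starts whenever the gap between two consecutive
    events exceeds max_gap_seconds (ten minutes by default).
    """
    sessions, current, last_ts = [], [], None
    for ts, account in sorted(events):
        if last_ts is not None and ts - last_ts > max_gap_seconds:
            sessions.append(current)
            current = []
        current.append(account)
        last_ts = ts
    if current:
        sessions.append(current)
    return sessions

def build_weighted_digraph(sessions):
    """Edge (a -> b) weight = number of times b directly follows a in a sequence."""
    graph = defaultdict(lambda: defaultdict(int))
    for session in sessions:
        for src, dst in zip(session, session[1:]):
            graph[src][dst] += 1
    return {src: dict(dsts) for src, dsts in graph.items()}

# Hypothetical data: U2 follows D more than ten minutes after E, so the sequence is cut.
u1 = [(0, "D"), (60, "A"), (120, "B")]
u2 = [(0, "B"), (60, "E"), (1000, "D"), (1060, "E"), (1120, "F")]
graph = build_weighted_digraph(split_sessions(u1) + split_sessions(u2))
```

With this data there is no edge from E to D, because the two follows fall into different sequences.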

S2022: Determining a sub-association relation of the target identification in the identification association relation according to the first object interaction information.

The identifier association relationship includes sub-association relationships corresponding to the plurality of identifiers, and the sub-association relationships corresponding to the target identifiers can be determined from the identifier association relationships based on the first object interaction information.

As a possible implementation manner, when the number of object interaction behaviors in the platform is large, edge-weight-based sampling (weighted walk) can be adopted. Sampling reduces and bounds the computation on the graph, avoiding the geometric growth in computation that occurs when the numbers of nodes and edges are large. During sampling, a random walk is performed, preferably in the direction of hot nodes (nodes with larger edge weights): a hot node has more associated data, whereas a non-hot node has less associated data and is strongly affected by accidental clicks, so samples obtained by walking toward hot nodes have higher confidence.
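One way to realize such a weighted walk is to choose each next node with probability proportional to the edge weight, so that the walk tends toward hot nodes. The sketch below assumes the graph is a dictionary of the form `{node: {neighbor: weight}}` with positive weights; the function names, walk length and number of walks per node are hypothetical parameters.

```python
import random

def weighted_walk(graph, start, walk_length, rng):
    """Walk from `start`, picking each next node in proportion to its edge weight."""
    walk = [start]
    while len(walk) < walk_length:
        neighbors = graph.get(walk[-1])
        if not neighbors:
            break  # dead end: no outgoing edges
        nodes = list(neighbors)
        weights = [neighbors[n] for n in nodes]
        walk.append(rng.choices(nodes, weights=weights, k=1)[0])
    return walk

def sample_walks(graph, walks_per_node, walk_length, seed=0):
    """Sample a bounded number of walks per node to keep the computation controlled."""
    rng = random.Random(seed)
    walks = []
    for node in graph:
        for _ in range(walks_per_node):
            walks.append(weighted_walk(graph, node, walk_length, rng))
    return walks
```

Fixing `walks_per_node` and `walk_length` caps the total number of sampled steps regardless of how many edges the graph contains, which is the point of sampling here.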

S2023: Generating a target identification vector of the target identification based on the first identification attribute information and the sub-association relation.

After the sub-association relation of the target identification is obtained, the first identification attribute information (side-info), which includes but is not limited to the resource classification and resource labels, may be fused in. For example, for an account named "science and technology universe", the corresponding resource classification is science and technology, and the resource label may be the most frequent label among the labels of the resources (e.g., science and technology articles) published by the account. The content may even be fused in as well to generate the target identification vector of the target identification.

Continuing with fig. 3, a plurality of user interaction information sequences are sampled by walking toward hot nodes in the weighted directed graph. Account prior features corresponding to the resources, such as the first account attribute information (resource category, resource label) and even the content, are then fused in to generate the account vectors of the accounts in the first platform, which include the target account vector of the target account.

The embodiment of the present application does not specifically limit the fusion manner. For example, the identification vector may be constructed using the DeepWalk and Skip-gram algorithms, where the DeepWalk algorithm is a combination of the Random Walk algorithm and the Skip-gram algorithm. The Random Walk algorithm is responsible for sampling the weighted directed graph to obtain the adjacency relations between its nodes, and the Skip-gram algorithm is trained on the sampled sequences to obtain the identification vectors, that is, the next node is predicted from an object interaction information sequence. For example, for the sequence a -> b -> e -> f with a sliding window of 2, a and b are the input and the prediction result is e; b and e are the input and the prediction result is f.
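The sliding-window example can be made concrete with a small helper that turns a sampled sequence into (input, prediction target) pairs. This only mirrors the example in the text, in which the nodes inside the window are the input and the node after the window is the target; it does not train any embedding model, and the function name is hypothetical.

```python
def sliding_window_examples(sequence, window=2):
    """Return (input_nodes, next_node) pairs for every window in the sequence."""
    return [
        (tuple(sequence[i:i + window]), sequence[i + window])
        for i in range(len(sequence) - window)
    ]

pairs = sliding_window_examples(["a", "b", "e", "f"], window=2)
# pairs == [(("a", "b"), "e"), (("b", "e"), "f")]
```

Pairs produced this way from all sampled walks would form the training set from which the identification vectors are learned.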

The embodiment of the application does not specifically limit the manner of promoting the auditing sequence of high-quality resources. For example, a resource to be audited that is determined to be a high-quality resource may be moved directly to the head of the audit queue. Another example is described below.

When account matching is performed, high-quality identifications of multiple platforms can be obtained to form a high-quality identification set. Within the set, different high-quality identifications have different high-quality degrees, and the high-quality degree can be determined according to the influence parameter of the high-quality identification and the importance parameter of the platform where it is located. For example, if the influence of high-quality identification a is high and the platform where it is located is of high importance to the first platform, the high-quality degree of high-quality identification a is high. The high-quality identifications in the set can thus be sorted by high-quality degree. After the target identification is successfully matched with a high-quality identification, the auditing sequence of the resource to be audited in the first platform can be promoted according to the high-quality degree of that high-quality identification.

For example, the first platform has a plurality of auditing queues of different levels, and if the quality degree of the resource to be audited is high, the resource to be audited can be inserted into the high-priority auditing queue, and can also be inserted into different positions according to different quality degrees.
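Inserting resources at different positions according to their high-quality degree can be sketched with a single priority queue. The class and parameter names are hypothetical, and using one heap ordered by high-quality degree (with arrival order as the tie-breaker) is an assumption; a platform with several queue levels could equally map degree ranges to separate queues.

```python
import heapq
import itertools

class AuditQueue:
    """Audit queue in which a higher quality degree is audited earlier.

    Resources with equal quality degree keep their arrival (time) order.
    """

    def __init__(self):
        self._heap = []
        self._arrival = itertools.count()

    def push(self, resource_id, quality_degree=0.0):
        # heapq is a min-heap, so the degree is negated to pop the largest first.
        heapq.heappush(self._heap, (-quality_degree, next(self._arrival), resource_id))

    def pop(self):
        return heapq.heappop(self._heap)[2]

    def __len__(self):
        return len(self._heap)

queue = AuditQueue()
queue.push("r1")                       # ordinary resource
queue.push("r2", quality_degree=0.9)   # matched a highly ranked high-quality identification
queue.push("r3")
queue.push("r4", quality_degree=0.5)
order = [queue.pop() for _ in range(len(queue))]
# order == ["r2", "r4", "r1", "r3"]
```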

The first platform needs potential high-quality identifications to improve its reputation among users and to improve key indicators such as user stickiness and usage duration. When resources are recommended, exposure is distributed unevenly, that is, a small number of recommended resources receive a large share of the exposure; if these resources are high-quality resources, the potential high-quality identifications play a more prominent role, which improves users' reading experience and increases stickiness. In particular, related data is scarce under cold-start conditions; by determining the potential high-quality identifications, the distribution weight of a potential high-quality identification, of all the resources it uploads, or of some of its high-quality resources can be increased, raising the exposure weight of the resources uploaded by the potential high-quality identification and further improving key indicators such as user stickiness and usage duration.

Thus, the proportion of potential high-quality identifications in the first platform can be quantified. A first number of potential high-quality identifications in the first platform that belong to a target resource field and a second number of all accounts in the first platform that belong to the target resource field are obtained; if the ratio of the first number to the second number is smaller than a ratio threshold, the exposure weight of the potential high-quality identifications in the target resource field is raised when resources are recommended.

For example, in the science and technology resource field, if the potential high-quality identifications account for only a small share of the accounts belonging to that field, they can be given higher exposure weights when resources are recommended, so that the resources to be audited that they upload are exposed as much as possible. When high-quality resources in the target resource field are scarce, key indicators such as user stickiness and usage duration are improved by exposing the existing high-quality resources as much as possible.
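The ratio test can be written out as a short function. The names `base_weight` and `boost`, and the multiplicative form of the boost, are assumptions made for illustration; the text only states that the exposure weight is raised when the ratio falls below the threshold.

```python
def exposure_weight(first_number, second_number, ratio_threshold,
                    base_weight=1.0, boost=1.5):
    """Exposure weight for potential high-quality identifications in one resource field.

    first_number:  potential high-quality identifications in the field
    second_number: all accounts in the field
    """
    if second_number <= 0:
        return base_weight  # assumption: no accounts in the field, nothing to boost
    ratio = first_number / second_number
    return base_weight * boost if ratio < ratio_threshold else base_weight
```

For instance, with 5 potential high-quality identifications among 1000 science and technology accounts and a hypothetical threshold of 0.05, the function returns the boosted weight.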

Referring to fig. 4, this figure is a schematic diagram of a resource processing link according to an embodiment of the present application. After a user uploads a resource to be audited to the first platform through a registered identification, the resource is stored in the repository to await auditing. Auditing can be divided into machine auditing and manual auditing: obviously illegal resources are filtered out by machine auditing, and the remaining resources are audited manually, for example under the publish-first, audit-later mechanism; resources that pass the audit can be recommended. Meanwhile, to keep the quality of resource content controllable, a point-of-presence (POP) monitoring mechanism can be adopted: emotion mining and analysis are performed on users' comments on a resource, and problematic resources are pushed to manual re-review.

Therefore, the information of the high-quality identifications of the second platform is crawled to generate high-quality identification vectors, which are matched against the identification vectors of the identifications of the resources to be audited, so that resources to be audited that are recognized as high-quality resources are, as far as possible, not backlogged or filtered out by the resource processing link. For example, the scheduling policy of the resource processing link may be adjusted according to the state of the matched resource (in the audit queue of the machine audit mode, in the audit queue of the manual audit mode, in the disabled state, and the like): if the resource is in the audit queue of the machine audit mode, it is inserted into the high-priority queue for machine processing; if it is in the audit queue of the manual audit mode, the manual audit is accelerated; if it is in the disabled state, it may be re-enabled. In this way the supply efficiency of the resource processing link is optimized and the distribution of high-quality resources is accelerated.

A high-quality identification in the second platform may fail to match any identification in the first platform. For example, an identification vectorization index library may be constructed that contains the identification vectors of the identifications to which a plurality of resources to be audited respectively belong; if matching between a high-quality identification vector and all the identification vectors in the index library fails, the high-quality identification (such as a high-quality account) has not yet been introduced by the first platform. In that case, content can be introduced from a content introduction source (e.g., a self-media author or a content producer), for example by attracting the author behind the high-quality identification through Business Development (BD) and obtaining resources published directly by the author, thereby enriching the supply of high-quality resources in the platform repository.

As a possible implementation manner, a third number of potential high-quality identifications in the first platform and a fourth number of identification vectors used by the first platform for similarity matching may also be obtained within a target time period, such as a day, a week or a month. If the ratio of the third number to the fourth number (the coverage rate) is smaller than a coverage threshold, a prompt for introducing high-quality identifications is sent: a low coverage rate means that the first platform has few potential high-quality identifications and few high-quality resources, so the work can focus on introducing high-quality resources. Conversely, a high coverage rate indicates that the first platform has many potential high-quality identifications, so the work can focus on speeding up the auditing of high-quality resources.
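The coverage check can be sketched as follows. The function name and the two returned labels are hypothetical; the threshold value is left to the caller because the text does not fix one.

```python
def coverage_advice(third_number, fourth_number, coverage_threshold):
    """Return (coverage_rate, suggested_focus) for one target time period.

    third_number:  potential high-quality identifications found in the period
    fourth_number: identification vectors used for similarity matching in the period
    """
    if fourth_number <= 0:
        raise ValueError("fourth_number must be positive")
    coverage = third_number / fourth_number
    if coverage < coverage_threshold:
        return coverage, "introduce_high_quality_identifications"
    return coverage, "accelerate_audit_of_high_quality_resources"
```

A scheduled job could call this once per day, week or month and send the prompt whenever the first label is returned.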

The resource processing link is equivalent to a huge funnel through which only part of the incoming resources pass. For the resources that do not get through, the reasons why machine audit and manual audit filtered them out, such as clickbait titles, lack of substance, or advertising content, can be analyzed. On that basis, the accuracy of resource matching can be improved and mistaken filtering by the resource processing link can be reduced, that is, cases in which a resource performs well on the second platform but is filtered out on the first platform and never gets enabled for distribution.

Meanwhile, the similarity-based deduplication capability can be optimized: improving the accuracy of similarity calculation reduces the proportion of high-quality resources that are wrongly removed by deduplication in the resource processing link. Accounts can be re-verified for publish-first, audit-later eligibility, so that link processing time is reduced either by granting that capability or by accelerating manual audit through a higher audit priority. For re-enabled resources to be audited, corresponding content marks are added, the recommendation weight of the potential high-quality identification is increased, and cold-start weighted exposure is performed on the recommendation side. This improves the overall distribution effect and priority of high-quality identifications, so that the resources they upload can be enabled and distributed with a shorter delay.

Referring to fig. 5, this figure is a schematic diagram of a crawling system provided in an embodiment of the present application. The crawling system 500 includes a component unit 510, an intelligent algorithm unit 520, a crawler engine unit 530, a scheduling service unit 540, and a visualization configuration unit 550.

The component unit 510 modularizes components for cracking different platforms to realize one-key access, and as shown in fig. 5, may include an a-platform cracking component 511, a B-platform cracking component 512, a C-platform cracking component 513, and a D-platform cracking component 514.

The intelligent algorithm unit 520 is used for automatically identifying relevant information of resources, such as article titles, article contents and the like, and may include a list identification 521, a resource title identification 522, a resource content identification 523 and a resource link extraction 524.

The crawler engine unit 530 is used to support crawling of information from different platforms and may include puppeteer instance 531, browser instance 532, pre-processing 533, login state maintenance 534, anti-headless detection 535, anti-behavior detection 536, page crawling 537, and agent (proxy) pool 538. Puppeteer is a Node.js package released by Google's Chrome development team in 2017 for simulating the operation of the Chrome browser. Its Application Programming Interface (API) makes it convenient to control the browser and to implement crawlers, website screenshots, PDF generation of web pages and the like. It provides two main modes, headless (no interface) and headful (with interface), which differ in the request headers and rendering behavior (e.g., the header information and rendering environment that a website can detect). The crawler engine unit 530 has a certain capability to cope with anti-crawling measures (e.g., using IP pools, which refer to the maximum IP segments that a certain operator can provide, and controlling the crawling frequency) so as to simulate the login of an actual user. To improve crawling efficiency, multiple crawling tasks can be created in a multi-threaded, multi-service manner, with scheduling provided by an automated crawling task management service.

The scheduling service unit 540 is used for automatic management of user tasks and may include distributed management 541, task scheduling 542, machine monitoring 543, and task keep-alive 544.

The visualization configuration unit 550 is used for zero-development visual configuration, and may include a crawling rule setting 551, a crawling type setting 552, a task information setting 553, a crawling policy setting 554, and a crawling platform setting 555. Relevant personnel can configure the crawling policy and the crawling rule directly through the visualization configuration unit 550.

In order to better understand the resource identification method provided by the embodiment of the application, the embodiment of the application also provides a resource identification system. The resource identification system provided by the embodiment of the present application is described below.

Referring to fig. 6, this figure is a schematic structural diagram of a resource identification system according to an embodiment of the present application. As shown in fig. 6, the resource identification system includes a content production end 601, a content consumption end 602, an uplink and downlink content interface server 603, a recommended distribution and content distribution export service 604, a content database 605, a scheduling center 606, a manual review system 607, a statistical reporting interface and analysis service 608, a potential high-quality identifier accelerated scheduling service 609, a machine review system 610, a network high-quality identifier library 611, an identifier vectorization matching service 612, and a network crawling and parsing service 613, which are described below.

The content producing end 601 is configured to:

(1) serve content producers of Professional Generated Content (PGC), User Generated Content (UGC), Multi-Channel Network (MCN) or Professional User Generated Content (PUGC), who provide, through a mobile terminal or a back-end Application Programming Interface (API) system, image-text content or video content, including short videos and small videos, uploaded locally or through a Web publishing system; these are hereinafter referred to as content and are the main content sources for distribution;

(2) through communication with the uplink and downlink content interface server 603, the interface address of the upload server is obtained first, and then the content is distributed.

The content consumption end 602 is configured to:

(1) as a consumer, communicate with the uplink and downlink content interface server 603 to obtain index information of the content to be accessed, and then communicate with the uplink and downlink content interface server 603 and the content export service 604 to consume the content directly; the content index, which is the prerequisite for consumption, is obtained through Feeds recommendation and distribution;

(2) through a built-in Feeds and a user clicking behavior and environment reporting module, collecting the current network environment of the user, the user clicking operation behavior on the Feeds intermediate information and exposure data of Feeds content, and reporting to a statistical reporting interface and analysis service 608;

(3) if the video content reports that the video is played for too long, the video content is cached for too long, user interaction behaviors such as comment, forwarding, sharing, collecting and praise are carried out, and negative behaviors such as reporting and negative feedback behaviors are carried out.

The uplink and downlink content interface server 603 is configured to:

(1) communicate directly with the content production end 601 and receive the resources submitted from the front end, which usually include the title, publisher, abstract, cover image and publishing time of the resource, and store the meta-information of the content in the content database 605.

The recommendation distribution and content distribution export service 604 is configured to:

(1) obtaining a result of the recommended distribution, issuing the result to the content consumption end 602, and displaying the result in a Feeds list of the user;

(2) a content export service is typically a set of access services deployed geographically close to the user;

(3) set the initial review level of an account through operation configuration at the entrance of the content database 605 according to the source of the publisher's account; some accounts can be marked as high-quality identifiers, which is mainly tied to the operation strategy, and high-quality identifiers are given a higher manual review scheduling priority;

(4) report the publishing-flow information of each account, including publishing time, content type and the like, to the statistical reporting interface and analysis service 608, and store content annotation information, such as classification, labels, selected cover image and title, as extension information in the content database 605.

The content database 605 is configured to:

(1) store primarily the meta-information of the content, such as size, cover image link, title, publishing time, author account, source channel and storage time, as well as the classification assigned to the content during manual review (including first-, second- and third-level classifications and label information; for example, for an article about a Huawei mobile phone, the first-level classification is science and technology, the second-level classification is smartphones, the third-level classification is domestic mobile phones, and the label information is Huawei Mate 30);

(2) in the process of manual review, the manual review system 607 reads the information in the content database 605, and the result and state of the manual review are also returned to the content database 605 for storage;

(3) the processing of content by the dispatch center 606 mainly includes machine processing by the machine review system 610 and manual review by the manual review system 607; for example, according to service requirements, the image-text deduplication server loads the content that has been enabled in the content database 605 within a past period (for example, one week; the validity period of video content is longer, for example, 3 months), and content that repeatedly enters the content database 605 is given a filter flag and is no longer provided to the recommendation distribution and content distribution export service 604 for output to the user through the content consumption end 602.
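
The deduplication behavior in item (3) can be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation of the embodiment: the text does not say how repeated content is detected, so a SHA-256 hash of normalized text is assumed here, "3 months" is approximated as 90 days, and the names `DedupFilter`, `fingerprint` and `VALIDITY` are hypothetical.

```python
import hashlib
from datetime import datetime, timedelta

# Validity windows follow the examples in the text (one week, about 3 months);
# the content-type keys are hypothetical.
VALIDITY = {"image_text": timedelta(weeks=1), "video": timedelta(days=90)}


def fingerprint(body: str) -> str:
    """Assumed duplicate test: SHA-256 of lowercased, whitespace-normalized text."""
    normalized = " ".join(body.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


class DedupFilter:
    def __init__(self):
        # fingerprint -> (content type, time the content was enabled)
        self._seen = {}

    def check(self, body: str, content_type: str, now: datetime) -> bool:
        """Return True if the content should carry a filter flag (a duplicate)."""
        key = fingerprint(body)
        entry = self._seen.get(key)
        if entry is not None:
            seen_type, enabled_at = entry
            if now - enabled_at <= VALIDITY[seen_type]:
                return True  # repeated within the validity window
        # First sighting, or the earlier copy has expired: record this one.
        self._seen[key] = (content_type, now)
        return False
```

A production system would more likely use a near-duplicate fingerprint than an exact hash; the exact hash is used here only to keep the sketch self-contained.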

The dispatch center 606 is configured to:

(1) be responsible for the whole scheduling process of the content flow: receive the content that enters the content database 605 through the uplink and downlink content interface server 603, and then obtain the meta-information of the resource from the content database 605;

(2) schedule the machine review system 610 to block and filter content that is politically sensitive or crosses legal bottom lines, such as pornography, gambling and drugs, and to handle content deduplication;

(3) under the publish-first, review-later mechanism, send resources that do not meet its screening conditions, for example resources with security problems that require manual review, to the manual review system 607 for manual review;

(4) communicate with the potential high-quality identifier accelerated scheduling service 609 to accelerate the resource processing and distribution link;

(5) the content scheduling processing data is reported to the statistics reporting interface and analysis service 608.
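
The routing decisions made by the dispatch center can be illustrated with the following minimal sketch. The three boolean fields, the `Route` values and the two-level priority are assumptions introduced for illustration; the embodiment does not define these data structures.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple


class Route(Enum):
    REJECT = "reject"
    MANUAL_REVIEW = "manual_review"
    PUBLISH_THEN_REVIEW = "publish_then_review"


@dataclass
class Resource:
    resource_id: str
    machine_flagged: bool         # machine review found prohibited or duplicate content
    eligible_publish_first: bool  # meets the publish-first, review-later conditions
    potential_high_quality: bool  # account matched by the accelerated scheduling service


def schedule(resource: Resource) -> Tuple[Route, int]:
    """Return (route, priority); a lower priority number is handled sooner."""
    if resource.machine_flagged:
        return Route.REJECT, 0
    # Matched accounts are moved ahead of ordinary ones in whichever route applies.
    priority = 0 if resource.potential_high_quality else 1
    if resource.eligible_publish_first:
        return Route.PUBLISH_THEN_REVIEW, priority
    return Route.MANUAL_REVIEW, priority
```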

The manual review system 607 is configured to:

(1) read the original information of the content itself from the content database 605; the manual review system is usually a business-complex system developed on a Web database, and its main purpose is to ensure that pushed content complies with local laws and policies, for example by performing a round of preliminary filtering for pornographic, gambling-related and politically sensitive content;

(2) receive the content that needs manual review pushed by the statistical reporting interface and analysis service 608, including content whose counted negative feedback and reports call for re-review, so as to reduce and control the distribution risk of content that is published first and reviewed later;

(3) the results of the manual review are finally written into the content database 605 through the dispatch center 606.

The statistics reporting interface and analysis service 608 is configured to:

(1) receive reports from the content consumption end 602 of the current network environment, the user's click operations on information in the Feeds, and the exposure data of Feeds resources;

(2) communicate with the content consumption end 602 and receive the user interaction information reported for the content, such as comments (short-text UGC written by users), likes, forwards and collections; at the same time, count in real time, per resource, the negative feedback and reports submitted from the content consumption end 602, and push resources whose counts exceed a certain threshold to the manual review system 607 for re-review.
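
The per-resource counting in item (2) can be sketched as follows. This is a minimal sketch, assuming a single count threshold shared by reports and negative feedback; the class name `NegativeFeedbackMonitor` and the choice to push each resource only once are hypothetical.

```python
from collections import Counter


class NegativeFeedbackMonitor:
    def __init__(self, threshold: int):
        self.threshold = threshold  # assumed: one threshold for all negative events
        self.counts = Counter()     # resource id -> reports plus negative feedback
        self.pushed = set()         # resources already sent for re-review

    def record(self, resource_id: str) -> bool:
        """Count one negative event; return True when the resource is pushed for re-review."""
        self.counts[resource_id] += 1
        exceeded = self.counts[resource_id] > self.threshold
        if exceeded and resource_id not in self.pushed:
            self.pushed.add(resource_id)
            return True
        return False
```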

The potential high-quality identifier accelerated scheduling service 609 is configured to:

(1) communicate with the network high-quality identifier library 611 and the identifier vectorization matching service 612;

(2) based on the processing flow and strategy described above, operate independently and adjust the scheduling strategy of the processing link according to the matching state of the account, accelerating manual review or granting distribution rights, so as to realize accelerated scheduling of high-quality resources.

The web crawling and parsing service 613 is configured to:

(1) according to the crawling system described above, support different terminals depending on the platform of the content source to be crawled;

(2) according to the configured crawling rule, the contents obtained by crawling and analyzing from different platforms through the internet are written into a network high-quality identification library 611;

(3) communicate with the potential high-quality identifier accelerated scheduling service 609 and provide the original network high-quality identifier library 611 as the basis for matching high-quality identifiers in the resource processing link, which amounts to using information from outside the platform as a yardstick for measuring how well potential high-quality identifiers are processed and covered.

The identifier vectorization matching service 612 is configured to:

(1) according to the steps and processes described above, communicate with the dispatch center 606, read the data of the content database 605, and vectorize each account, including vectorization of the video content itself and of the label and classification information of the content published by the account, to construct an identifier vector, and then build an identifier vectorization index library from the identifier vectors;

(2) match the high-quality identifiers in the network high-quality identifier library 611, obtained by crawling and parsing, against the accounts being processed in the identifier vectorization index library, so that the processing link for resources of matched potential high-quality identifiers is accelerated, and the processing efficiency and enabling rate of the content uploaded by those potential high-quality identifiers are improved.
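
The index-and-match step performed by the identifier vectorization matching service can be sketched as follows. This is a minimal sketch under stated assumptions: cosine similarity and a fixed threshold are assumed as the similarity test, the index is a plain in-memory dictionary scanned linearly, and `IdentifierVectorIndex` is a hypothetical name. A deployed service would typically use an approximate nearest-neighbor index instead.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class IdentifierVectorIndex:
    def __init__(self):
        self._vectors = {}  # account id -> identifier vector

    def add(self, account_id, vector):
        self._vectors[account_id] = list(vector)

    def match(self, query, threshold=0.9):
        """Return (account_id, similarity) pairs at or above the threshold, best first."""
        scored = [(aid, cosine(query, vec)) for aid, vec in self._vectors.items()]
        hits = [(aid, s) for aid, s in scored if s >= threshold]
        return sorted(hits, key=lambda hit: hit[1], reverse=True)
```

Here `query` stands for a high-quality identifier vector built from the crawled data, and the accounts returned by `match` are the candidates treated as potential high-quality identifiers.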

In order to better understand the resource identification method provided in the embodiment of the present application, the resource identification process is described below with reference to a specific application scenario.

The first platform identifies the content to be audited in the audit queue in the first platform by calling the resource identification system, and can perform targeted mining, processing and acceleration in a resource introduction stage, a resource processing stage and a resource distribution stage for high-quality accounts of other platforms (such as a second platform).

Taking a video as an example of the resource, in the resource introduction stage, account attribute information and user interaction information of high-quality accounts on other platforms are crawled to construct high-quality account vectors. In the resource processing stage, for each video to be audited in the audit queue (that is, a resource to be audited), a target account vector is constructed according to the account attribute information and user interaction information of the target account to which the video belongs, and an account vectorization index library is built from these vectors. The high-quality account vectors are then matched against the account vectors in the account vectorization index library. If the matching is successful, manual review or machine review is accelerated for the videos to be audited that belong to the successfully matched accounts (that is, potential high-quality accounts). This improves the accuracy of account matching, speeds up review, and reduces false rejections and the time consumed in the account processing link. In the resource distribution stage, potential high-quality accounts are given a certain weighting, which accelerates their cold start.

Therefore, with the same manual review manpower, more potential high-quality accounts can be enabled into the recommendation pool; the coverage, processing and distribution of potential high-quality accounts can be monitored quantitatively; high-quality resources can be enabled and distributed with a shorter delay; the value of account-ecosystem optimization for information-flow content creation and distribution is realized; and the enabling rate and distribution effect of potential high-quality accounts are continuously improved.

Aiming at the resource identification method provided by the embodiment, the embodiment of the application also provides a resource identification device. Referring to fig. 7, which is a schematic diagram of a resource identification apparatus according to an embodiment of the present application, the apparatus 700 includes: an acquisition unit 701, a construction unit 702, and an execution unit 703;

the acquiring unit 701 is configured to acquire first identifier attribute information of a target identifier to which a resource to be audited belongs in a first platform and first object interaction information corresponding to a resource issued by the target identifier in the first platform, and acquire second identifier attribute information of a high-quality identifier in a second platform and second object interaction information corresponding to a resource issued by the high-quality identifier in the second platform, where the high-quality identifier is determined based on an influence parameter identified in the second platform;

the constructing unit 702 is configured to construct a target identifier vector of the target identifier according to the first identifier attribute information and the first object interaction information, and construct a high-quality identifier vector of the high-quality identifier according to the second identifier attribute information and the second object interaction information;

the execution unit 703 is configured to determine, if the similarity matching between the high-quality identifier vector and the target identifier vector is successful, that the target identifier is a potential high-quality identifier in the first platform, where the high-quality identifier and the target identifier belong to the same resource field, and promote an audit sequence of the resource to be audited in the first platform.
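
Promoting the audit sequence of a resource whose target identifier is a potential high-quality identifier can be sketched with a priority queue. This is a minimal sketch, assuming a two-tier priority in which promoted resources are reviewed before ordinary ones and submission order is kept within each tier; `AuditQueue` is a hypothetical name.

```python
import heapq
import itertools


class AuditQueue:
    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # keeps submission order within a tier

    def submit(self, resource_id, potential_high_quality=False):
        # Tier 0 is popped before tier 1, which is how promotion is modeled here.
        tier = 0 if potential_high_quality else 1
        heapq.heappush(self._heap, (tier, next(self._counter), resource_id))

    def next_for_review(self):
        """Pop the resource that should be reviewed next."""
        return heapq.heappop(self._heap)[2]
```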

As a possible implementation manner, the acquiring unit 701 is further configured to:

acquiring first content of the resource to be audited and second content of the resource issued by the high-quality identification;

the constructing unit 702 is configured to:

and constructing a target identification vector of the target identification according to the first identification attribute information, the first object interaction information and the first content, and constructing a high-quality identification vector of the high-quality identification according to the second identification attribute information, the second object interaction information and the second content.
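
One way to combine identification attribute information, object interaction information and content into a single identification vector is to normalize each group and concatenate them. The following is a minimal sketch under that assumption; the field names (`followers`, `posts`, `views`, `likes`, `comments`, `shares`) and the precomputed `content_embedding` are hypothetical and are not specified by the embodiment.

```python
import math


def l2_normalize(values):
    """Scale a list of numbers to unit length; an all-zero list is returned unchanged."""
    norm = math.sqrt(sum(v * v for v in values))
    return [v / norm for v in values] if norm else list(values)


def build_identifier_vector(attributes, interactions, content_embedding):
    """Concatenate three normalized feature groups into one identification vector."""
    # Attribute features: log-scaled so very large accounts do not dominate.
    attr_part = [math.log1p(attributes["followers"]), math.log1p(attributes["posts"])]
    # Interaction features: rates per view rather than raw counts.
    views = max(interactions["views"], 1)
    inter_part = [
        interactions["likes"] / views,
        interactions["comments"] / views,
        interactions["shares"] / views,
    ]
    return l2_normalize(attr_part) + l2_normalize(inter_part) + l2_normalize(content_embedding)
```

Applying the same function to the second identification attribute information, second object interaction information and second content would give a high-quality identification vector in the same space, which is what makes the similarity comparison meaningful.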

As a possible implementation manner, the execution unit 703 is further configured to:

if the similarity matching between the high-quality identification vector and the target identification vector is successful, and the resource to be audited is the same as the resource issued by the high-quality identification on the second platform, determining that the target identification is the homologous account number of the high-quality identification on the first platform, and issuing the resource to be audited on the first platform.
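
The distinction between a homologous identification and a potential high-quality identification can be sketched as follows. This is a minimal sketch, assuming that "the same resource" is detected by a byte-identical SHA-256 digest and that similarity is compared against a fixed threshold; both are illustrative assumptions, and the function names are hypothetical.

```python
import hashlib


def content_digest(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def classify_target(similarity, pending_resource, premium_resources, threshold=0.9):
    """Return 'homologous', 'potential_high_quality', or 'ordinary'."""
    if similarity < threshold:
        return "ordinary"  # similarity matching failed
    premium_digests = {content_digest(r) for r in premium_resources}
    if content_digest(pending_resource) in premium_digests:
        # Same resource already issued on the second platform: issue it directly.
        return "homologous"
    # Similar account but different resource: promote it in the audit sequence.
    return "potential_high_quality"
```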

As a possible implementation manner, the acquiring unit 701 is further configured to:

acquiring an identification association relation for reflecting association among the accounts in the first platform;

the constructing unit 702 is configured to:

determining a sub-incidence relation of the target identification in the identification incidence relation according to the first object interaction information;

and generating a target identification vector of the target identification based on the first identification attribute information and the sub-incidence relation.
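
The two steps above can be illustrated with a small graph sketch. Everything specific here is an assumption made for illustration: the identification association relation is modeled as a weighted adjacency dictionary, the sub-association relation keeps only the edges to accounts that share at least one interacting user with the target, and the vector simply appends one edge-weight slot per known account to the attribute features.

```python
def sub_association(graph, target, interacting_users):
    """Keep the target's edges to accounts sharing at least one interacting user."""
    audience = set(interacting_users.get(target, ()))
    return {
        other: weight
        for other, weight in graph.get(target, {}).items()
        if audience & set(interacting_users.get(other, ()))
    }


def target_vector(attributes, sub_edges, all_accounts):
    """Attribute features followed by one edge-weight slot per known account."""
    return list(attributes) + [sub_edges.get(account, 0.0) for account in all_accounts]
```

In practice a graph embedding method would more likely be used to turn the sub-association relation into a dense vector; the fixed-slot encoding is chosen only because it is easy to verify by hand.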

As a possible implementation manner, the execution unit 703 is configured to:

acquiring a high-quality identification set for similarity matching;

determining the quality degree of the high-quality identification according to the importance degree parameter of the platform where the high-quality identification is located and the influence parameter of the high-quality identification;

and promoting the auditing sequence of the resource to be audited in the first platform according to the high-quality degree of the high-quality identification.
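
Ordering the audit sequence by quality degree can be sketched as follows. The embodiment does not state how the importance degree parameter of the platform and the influence parameter are combined, so a simple product is assumed here; the function names are hypothetical.

```python
def quality_degree(platform_importance, influence):
    """Assumed combination: the product of the two parameters."""
    return platform_importance * influence


def order_for_review(pending):
    """Sort pending resources so that higher quality degrees are reviewed first.

    pending: list of (resource_id, platform_importance, influence), where the last
    two values belong to the high-quality identification that the resource matched.
    """
    ranked = sorted(pending, key=lambda p: quality_degree(p[1], p[2]), reverse=True)
    return [resource_id for resource_id, _, _ in ranked]
```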

As a possible implementation manner, the acquiring unit 701 is further configured to:

acquiring a first number of potential high-quality identifiers in the first platform, which belong to the field of target resources, and a second number of accounts in the first platform, which belong to the field of target resources;

the execution unit 703 is configured to:

and if the ratio of the first quantity to the second quantity is smaller than a ratio threshold, improving the exposure weight of the potential high-quality identifier in the target resource field during resource recommendation.
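
The ratio check above can be sketched as follows. This is a minimal sketch; the multiplicative `boost` and the handling of an empty resource field are assumptions, since the embodiment only states that the exposure weight is improved when the ratio is below the threshold.

```python
def exposure_weight(first_number, second_number, ratio_threshold,
                    base_weight=1.0, boost=1.5):
    """Return the exposure weight for potential high-quality identifiers in a field."""
    if second_number == 0:
        return base_weight  # assumed: no accounts in the field, nothing to adjust
    if first_number / second_number < ratio_threshold:
        return base_weight * boost  # field is under-supplied: raise exposure
    return base_weight
```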

As a possible implementation manner, the acquiring unit 701 is further configured to:

acquiring an identification vector set used for similarity matching in the first platform;

the execution unit 703 is configured to:

and if the similarity matching between the high-quality identification vector and all the identification vectors in the identification vector set fails, sending a prompt for introducing the high-quality identification.

As a possible implementation manner, the acquiring unit 701 is further configured to:

in a target time period, acquiring a third number of potential high-quality identifiers in the first platform and a fourth number of identifier vectors used by the first platform for similarity matching;

the execution unit 703 is configured to:

and if the ratio of the third quantity to the fourth quantity is smaller than a coverage threshold, sending out a prompt for introducing the high-quality identification.
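
The two conditions for prompting the introduction of a high-quality identification (no match in the whole identification vector set, and coverage below a threshold in the target time period) can be sketched as follows. This is a minimal sketch; the function names are hypothetical, and treating an empty set of vectors as insufficient coverage is an assumption.

```python
def no_match_in_platform(similarities, match_threshold):
    """True when the high-quality identification vector matches no vector in the set."""
    return all(s < match_threshold for s in similarities)


def coverage_too_low(third_number, fourth_number, coverage_threshold):
    """True when potential high-quality identifiers cover too few matched vectors."""
    if fourth_number == 0:
        return True  # assumed: with nothing to match against, coverage is insufficient
    return third_number / fourth_number < coverage_threshold


def should_prompt_introduction(similarities, match_threshold,
                               third_number, fourth_number, coverage_threshold):
    """Either condition alone is enough to issue the prompt."""
    return (no_match_in_platform(similarities, match_threshold)
            or coverage_too_low(third_number, fourth_number, coverage_threshold))
```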

The resource identification device may be a computer device, the computer device may be a server, or may also be a terminal device, the resource identification device may be embedded in the server or the terminal device, and the computer device provided in the embodiment of the present application will be described below from the perspective of hardware implementation. Fig. 8 is a schematic structural diagram of a server, and fig. 9 is a schematic structural diagram of a terminal device.

Referring to fig. 8, fig. 8 is a schematic structural diagram of a server 1400 provided by an embodiment of the present application. The server 1400 may vary considerably depending on its configuration or performance, and may include one or more Central Processing Units (CPUs) 1422 (e.g., one or more processors), a memory 1432, and one or more storage media 1430 (e.g., one or more mass storage devices) storing applications 1442 or data 1444. The memory 1432 and the storage medium 1430 may be transient or persistent storage. The program stored on the storage medium 1430 may include one or more modules (not shown), each of which may include a series of instruction operations on the server. Further, the CPU 1422 may be configured to communicate with the storage medium 1430 and execute, on the server 1400, the series of instruction operations stored in the storage medium 1430.

The server 1400 may also include one or more power supplies 1426, one or more wired or wireless network interfaces 1450, one or more input-output interfaces 1458, and/or one or more operating systems 1441, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, and so on.

The steps performed by the server in the above embodiments may be based on the server structure shown in fig. 8.

The CPU 1422 is configured to perform the following steps:

Optionally, the CPU 1422 may further execute the method steps of any specific implementation of the resource identification method in the embodiment of the present application.

Referring to fig. 9, fig. 9 is a schematic structural diagram of a terminal device according to an embodiment of the present application. Fig. 9 is a block diagram illustrating a partial structure of a smartphone related to a terminal device provided in an embodiment of the present application, where the smartphone includes: a Radio Frequency (RF) circuit 1510, a memory 1520, an input unit 1530, a display unit 1540, a sensor 1550, an audio circuit 1560, a Wireless Fidelity (WiFi) module 1570, a processor 1580, and a power supply 1590. Those skilled in the art will appreciate that the smartphone configuration shown in fig. 9 is not limiting and may include more or fewer components than shown, or some components in combination, or a different arrangement of components.

The following specifically describes each component of the smartphone with reference to fig. 9:

The RF circuit 1510 may be configured to receive and transmit signals during information transmission and reception or during a call; in particular, it receives downlink information from a base station and passes it to the processor 1580 for processing, and it transmits uplink data to the base station. In general, the RF circuit 1510 includes, but is not limited to, an antenna, at least one amplifier, a transceiver, a coupler, a Low Noise Amplifier (LNA), a duplexer, and the like. In addition, the RF circuit 1510 may also communicate with networks and other devices via wireless communication. The wireless communication may use any communication standard or protocol, including but not limited to Global System for Mobile communication (GSM), General Packet Radio Service (GPRS), Code Division Multiple Access (CDMA), Wideband Code Division Multiple Access (WCDMA), Long Term Evolution (LTE), email, Short Message Service (SMS), and the like.

The memory 1520 may be used to store software programs and modules, and the processor 1580 implements various functional applications and data processing of the smartphone by running the software programs and modules stored in the memory 1520. The memory 1520 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system, an application program required for at least one function (such as a sound playing function, an image playing function, etc.), and the like; the data storage area may store data (such as audio data, a phonebook, etc.) created according to the use of the smartphone, and the like. Further, the memory 1520 may include high-speed random access memory and may also include non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or another non-volatile solid-state storage device.

The input unit 1530 may be used to receive input numeric or character information and generate key signal inputs related to user settings and function control of the smartphone. Specifically, the input unit 1530 may include a touch panel 1531 and other input devices 1532. The touch panel 1531, also referred to as a touch screen, can collect touch operations of a user on or near it (e.g., operations performed by the user on or near the touch panel 1531 using any suitable object or accessory such as a finger or a stylus) and drive the corresponding connection device according to a preset program. Optionally, the touch panel 1531 may include two parts, a touch detection device and a touch controller. The touch detection device detects the touch position of the user, detects the signal generated by the touch operation, and transmits the signal to the touch controller; the touch controller receives the touch information from the touch detection device, converts it into touch point coordinates, sends the coordinates to the processor 1580, and can receive and execute commands sent by the processor 1580. In addition, the touch panel 1531 may be implemented in various types, such as resistive, capacitive, infrared, and surface acoustic wave. The input unit 1530 may include other input devices 1532 in addition to the touch panel 1531. In particular, the other input devices 1532 may include, but are not limited to, one or more of a physical keyboard, function keys (such as volume control keys, switch keys, etc.), a trackball, a mouse, a joystick, and the like.

The display unit 1540 may be used to display information input by the user or information provided to the user and various menus of the smartphone. The Display unit 1540 may include a Display panel 1541, and optionally, the Display panel 1541 may be configured in the form of a Liquid Crystal Display (LCD), an Organic Light-Emitting Diode (OLED), or the like. Further, the touch panel 1531 may cover the display panel 1541, and when the touch panel 1531 detects a touch operation on or near the touch panel 1531, the touch operation is transmitted to the processor 1580 to determine the type of the touch event, and then the processor 1580 provides a corresponding visual output on the display panel 1541 according to the type of the touch event. Although in fig. 9, the touch panel 1531 and the display panel 1541 are two separate components to implement the input and output functions of the smartphone, in some embodiments, the touch panel 1531 and the display panel 1541 may be integrated to implement the input and output functions of the smartphone.

The smartphone may also include at least one sensor 1550, such as light sensors, motion sensors, and other sensors. Specifically, the light sensor may include an ambient light sensor that may adjust the brightness of the display panel 1541 according to the brightness of ambient light and a proximity sensor that may turn off the display panel 1541 and/or backlight when the smartphone is moved to the ear. As one of the motion sensors, the accelerometer sensor can detect the magnitude of acceleration in each direction (generally, three axes), can detect the magnitude and direction of gravity when stationary, and can be used for applications (such as horizontal and vertical screen switching, related games, magnetometer attitude calibration) for recognizing the attitude of the smartphone, and related functions (such as pedometer and tapping) for vibration recognition; as for other sensors such as a gyroscope, a barometer, a hygrometer, a thermometer, and an infrared sensor, which can be configured on the smart phone, further description is omitted here.

The audio circuit 1560, a speaker 1561, and a microphone 1562 may provide an audio interface between the user and the smartphone. The audio circuit 1560 may transmit the electrical signal converted from received audio data to the speaker 1561, which converts the electrical signal into a sound signal and outputs it; conversely, the microphone 1562 converts collected sound signals into electrical signals, which are received by the audio circuit 1560 and converted into audio data. The audio data is output to the processor 1580 for processing and then either sent through the RF circuit 1510 to, for example, another smartphone, or output to the memory 1520 for further processing.

WiFi belongs to short-distance wireless transmission technology, and the smart phone can help a user to receive and send e-mails, browse webpages, access streaming media and the like through a WiFi module 1570, and provides wireless broadband internet access for the user. Although fig. 9 shows WiFi module 1570, it is understood that it does not belong to the essential components of the smartphone and may be omitted entirely as needed within the scope not changing the essence of the invention.

The processor 1580 is a control center of the smartphone, connects various parts of the entire smartphone by using various interfaces and lines, and performs various functions of the smartphone and processes data by operating or executing software programs and/or modules stored in the memory 1520 and calling data stored in the memory 1520, thereby integrally monitoring the smartphone. Optionally, the processor 1580 may include one or more processing units; preferably, the processor 1580 may integrate an application processor, which mainly handles operating systems, user interfaces, application programs, and the like, and a modem processor, which mainly handles wireless communications. It is to be appreciated that the modem processor may not be integrated into the processor 1580.

The smartphone also includes a power supply 1590 (e.g., a battery) for powering the various components, which may preferably be logically connected to the processor 1580 via a power management system, so as to manage charging, discharging, and power consumption management functions via the power management system.

Although not shown, the smart phone may further include a camera, a bluetooth module, and the like, which are not described herein.

In an embodiment of the application, the smartphone includes a memory 1520 that can store program code and transmit the program code to the processor.

The processor 1580 included in the smart phone may execute the resource identification method provided in the foregoing embodiments according to the instructions in the program code.

The embodiment of the present application further provides a computer-readable storage medium for storing a computer program, where the computer program is used to execute the resource identification method provided by the foregoing embodiment.

Embodiments of the present application also provide a computer program product or computer program comprising computer instructions stored in a computer-readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions to cause the computer device to perform the resource identification method provided in the various alternative implementations of the above aspects.

Those of ordinary skill in the art will understand that: all or part of the steps for realizing the method embodiments can be completed by hardware related to program instructions, the program can be stored in a computer readable storage medium, and the program executes the steps comprising the method embodiments when executed; and the aforementioned storage medium may be at least one of the following media: various media capable of storing program codes, such as Read-Only Memory (ROM), RAM, magnetic disk, or optical disk.

It should be noted that, in the present specification, all the embodiments are described in a progressive manner, and the same and similar parts among the embodiments may be referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, for the apparatus and system embodiments, since they are substantially similar to the method embodiments, they are described in a relatively simple manner, and reference may be made to some of the descriptions of the method embodiments for related points. The above-described embodiments of the apparatus and system are merely illustrative, and the units described as separate parts may or may not be physically separate, and the parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.

The above description is only one specific embodiment of the present application, but the scope of the present application is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present application should be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims

1. A method for resource identification, the method comprising:

constructing a target identification vector of the target identification according to the first identification attribute information and the first object interaction information, wherein the target identification vector of the target identification represents the high-quality characteristics of the target identification in the corresponding resource field; constructing a high-quality identification vector of the high-quality identification according to the second identification attribute information and the second object interaction information, wherein the high-quality identification vector of the high-quality identification represents the high-quality characteristics of the high-quality identification in the corresponding resource field;

2. The method of claim 1, further comprising:

the constructing a target identification vector of the target identification according to the first identification attribute information and the first object interaction information, and constructing a high-quality identification vector of the high-quality identification according to the second identification attribute information and the second object interaction information includes:

3. The method of claim 2, further comprising:

if the similarity matching between the high-quality identification vector and the target identification vector is successful, and the resource to be audited is the same as the resource issued by the high-quality identification on the second platform, determining that the target identification is the homologous identification of the high-quality identification on the first platform, and issuing the resource to be audited on the first platform.

4. The method of claim 1, further comprising:

acquiring an identifier association relation for reflecting association between identifiers in the first platform;

the constructing a target identification vector of the target identification according to the first identification attribute information and the first object interaction information includes:

5. The method according to claim 1, wherein the promoting of the auditing sequence of the resources to be audited in the first platform comprises:

acquiring a high-quality identification set for similarity matching;

6. The method according to any one of claims 1-5, further comprising:

7. The method according to any one of claims 1-5, further comprising:

8. The method according to any one of claims 1-5, further comprising:

9. An apparatus for resource identification, the apparatus comprising an acquisition unit, a construction unit and an execution unit;

the construction unit is configured to construct a target identification vector of the target identification according to the first identification attribute information and the first object interaction information, wherein the target identification vector of the target identification represents the high-quality characteristics of the target identification in the corresponding resource field; and to construct a high-quality identification vector of the high-quality identification according to the second identification attribute information and the second object interaction information, wherein the high-quality identification vector of the high-quality identification represents the high-quality characteristics of the high-quality identification in the corresponding resource field;

10. The apparatus of claim 9, wherein the acquisition unit is further configured to:

the construction unit is configured to:

11. The apparatus of claim 10, wherein the execution unit is further configured to:

12. The apparatus of claim 9, wherein the acquisition unit is further configured to:

the construction unit is configured to:

13. A computer device, the device comprising a processor and a memory:

the processor is configured to perform the method of any of claims 1-8 according to instructions in the program code.

14. A computer-readable storage medium, characterized in that the computer-readable storage medium is used to store a computer program for performing the method of any one of claims 1-8.
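
For illustration only, the vector construction and similarity matching recited in the claims can be sketched as follows. This is a minimal sketch under stated assumptions and is not the claimed implementation: the claims do not specify how the vectors are built or compared, so plain feature concatenation, cosine similarity, the 0.9 threshold, the list-based audit queue, and every function name below (`build_vector`, `cosine_similarity`, `is_potential_high_quality`, `promote`) are hypothetical choices made for this example.

```python
import math


def build_vector(attribute_features, interaction_features):
    """Build an identification vector (assumption: simple concatenation
    of numeric attribute features and numeric interaction features)."""
    return [float(x) for x in attribute_features] + [
        float(x) for x in interaction_features
    ]


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def is_potential_high_quality(target_vec, high_quality_vecs, threshold=0.9):
    """Similarity matching succeeds if any high-quality vector is close enough.
    The threshold value is an assumption, not taken from the source."""
    return any(
        cosine_similarity(target_vec, hq) >= threshold for hq in high_quality_vecs
    )


def promote(audit_queue, resource_id):
    """Move a resource to the front of the audit queue (hypothetical policy
    for promoting the auditing sequence)."""
    if resource_id in audit_queue:
        audit_queue.remove(resource_id)
        audit_queue.insert(0, resource_id)
    return audit_queue


# Usage: a target identification on the first platform is compared with a
# high-quality identification from the second platform.
target = build_vector([1, 0], [2, 3])
high_quality = build_vector([1, 0], [2, 3])
queue = ["res_a", "res_b", "res_c"]
if is_potential_high_quality(target, [high_quality]):
    promote(queue, "res_c")
```

In this sketch a successful match moves the matching resource to the head of the queue; a practical system might instead adjust a priority score so that several promoted resources keep a stable relative order.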