CN116089726B - Multi-language multi-modal resource recommendation method and device for Tibetan language - Google Patents

Publication number
CN116089726B
Authority
CN
China
Prior art keywords
resource
language
content
dialect
user
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310200016.6A
Other languages
Chinese (zh)
Other versions
CN116089726A (en
Inventor
于满泉
莫倩
王升
张传文
贾承斌
朱若曦
央金拉姆
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wiseweb Technology Group Co ltd
Beijing Wiseweb Big Data Technology Co ltd
Original Assignee
Wiseweb Technology Group Co ltd
Beijing Wiseweb Big Data Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wiseweb Technology Group Co ltd, Beijing Wiseweb Big Data Technology Co ltd
Priority to CN202310200016.6A
Publication of CN116089726A
Application granted
Publication of CN116089726B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/40 Processing or translation of natural language
    • G06F40/58 Use of machine translation, e.g. for multi-lingual retrieval, for server-side translation for client devices or for real-time translation
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Machine Translation (AREA)

Abstract

The application relates to a multi-dialect, multi-modal resource recommendation method and device for the Tibetan language. The method comprises the following steps: identifying the language/dialect type to which each content resource belongs, and marking the content resource with a corresponding resource-language type label according to the identification result; extracting the language type and interest points of each user; screening first content resources from each type of content resource, and performing cold-start probing of the first content resources in the corresponding target crowd; and screening second content resources from the first content resources according to the cold-start probing result, and propagating and diffusing the second content resources among multilingual crowds. Based on machine translation and similar-crowd diffusion, the scheme achieves multilingual/dialect matching recommendation, diffuses language-specific resources among language-specific crowds, and strikes a personalized balance between accuracy and generalization.

Description

Multi-language multi-modal resource recommendation method and device for Tibetan language
Technical Field
The application relates to the technical field of recommendation, and in particular to a multi-dialect, multi-modal resource recommendation method and device for the Tibetan language.
Background
The Tibetan areas of China cover a vast territory, and the regions within them differ considerably. Tibetan is divided into three dialects, the Ü-Tsang (Weizang) dialect, the Kham (Kang) dialect and the Amdo (Anduo) dialect, and Mandarin Chinese is also widely promoted in the Tibetan areas, so local residents have developed the daily habit of using a Tibetan dialect and Chinese side by side. The Tibetan dialects share essentially the same written form, but their spoken pronunciation differs greatly; speakers of different dialects often cannot communicate smoothly in Tibetan and need to use Chinese instead.
Information-flow content recommendation products on the market, such as the various social platforms built mainly on recommendation engines, mostly provide content recommendation services in Chinese; no content recommendation product has been built specifically for local residents of the Tibetan areas.
For example, suppose a Tibetan compatriot who speaks the Amdo dialect records a short video in that dialect, and a recommendation product pushes the video to Tibetan users who speak the Ü-Tsang dialect. The viewers can follow the video by reading the subtitles or by listening to Ü-Tsang speech synthesized from a transcription, but reading subtitles costs the user time and energy, and current speech synthesis cannot reproduce tone perfectly, so neither approach gives a good user experience.
Disclosure of Invention
In order to overcome, at least to some extent, the poor matching and recommendation of resource content in multi-dialect/multi-language scenarios in the related art, the application provides a multi-dialect, multi-modal resource recommendation method and device for the Tibetan language.
According to a first aspect of the embodiments of the present application, a multi-dialect, multi-modal resource recommendation method for the Tibetan language is provided, including the following steps:
identifying the language/dialect type to which each content resource belongs, and marking the content resource with a corresponding resource-language type label according to the identification result;
extracting the language type and interest points of each user;
screening first content resources from each type of content resource, and performing cold-start probing of the first content resources in the corresponding target crowd;
and screening second content resources from the first content resources according to the cold-start probing result, and propagating and diffusing the second content resources among multilingual crowds.
Further, the categories of the resource-language type labels include: image-text-Chinese, image-text-Tibetan, short video-Chinese, short video-Ü-Tsang dialect, short video-Kham dialect, and/or short video-Amdo dialect.
Further, identifying the language/dialect type to which each content resource belongs includes the following steps:
calling an existing general-purpose Tibetan-Chinese multi-modal machine translation interface to identify the language/dialect type to which each content resource belongs;
uniformly translating the multi-modal content resources into Chinese text and storing the text in the content model;
wherein the multi-modal content resources include: Tibetan image-text content, Tibetan dialect speech in short videos, and/or Tibetan short-video subtitles.
Further, screening the first content resources from each type of content resource includes the following step:
for each type of content resource, screening out the first content resources by a priori quality screening.
Further, performing cold-start probing of the first content resources in the corresponding target crowd includes the following steps:
acquiring a preset multilingual/dialect correspondence;
determining the corresponding target language crowd according to the language type of the first content resource;
and pushing the first content resource to the target language crowd for cold-start probing.
Further, the preset multilingual/dialect correspondence includes:
content resources of the image-text-Chinese type correspond to Chinese users, Ü-Tsang dialect users, Kham dialect users and Amdo dialect users;
content resources of the image-text-Tibetan type correspond to Ü-Tsang dialect users, Kham dialect users and Amdo dialect users;
content resources of the short video-Chinese type correspond to Chinese users, Ü-Tsang dialect users, Kham dialect users and Amdo dialect users;
content resources of the short video-Ü-Tsang dialect type correspond to Ü-Tsang dialect users;
content resources of the short video-Kham dialect type correspond to Kham dialect users;
content resources of the short video-Amdo dialect type correspond to Amdo dialect users.
Further, screening the second content resources from the first content resources according to the cold-start probing result includes the following steps:
performing posterior interaction screening according to the cold-start probing result;
and screening out the content resources with higher posterior interaction data indexes as the second content resources.
Further, propagating and diffusing the second content resources among multilingual crowds includes the following steps:
determining the multilingual crowds based on a similar-crowd diffusion method;
and propagating and diffusing the second content resources among the multilingual crowds.
Further, the similar-crowd diffusion method includes the following steps:
writing the earliest users who click a resource in its propagation-diffusion period into redis;
reading redis to obtain the historical click-user list of each nid in the propagation-diffusion period, thereby obtaining a plurality of seed users, where nid denotes the unique number of a content resource;
requesting the gcf-user vector service interface to obtain the vectors of the seed users, where a gcf-user vector is a user vector generated by graph-based collaborative filtering (Graph-based Collaborative Filtering), which is common in the industry;
applying average pooling to the seed user vectors to obtain the nid vector, and constructing a resource vector library;
for a target user to be recommended to, requesting the gcf-user vector service interface to obtain the user vector of the target user;
calculating the cosine similarity between the user vector of the target user and the nid vector of each seed user group in the resource vector library to obtain a similarity score between the target user and that resource's seed user group;
and recommending the resources with high scores to the target user.
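The pooling and scoring steps above can be written compactly. The notation below is ours rather than the patent's: S_nid is assumed to denote the set of seed users of resource nid, u_i the gcf-user vector of user i, and t the target user.

```latex
\mathbf{v}_{nid} = \frac{1}{|S_{nid}|} \sum_{i \in S_{nid}} \mathbf{u}_i ,
\qquad
\operatorname{score}(t, nid) =
  \frac{\mathbf{u}_t \cdot \mathbf{v}_{nid}}
       {\lVert \mathbf{u}_t \rVert \, \lVert \mathbf{v}_{nid} \rVert}
```

Under this reading, the nid vector is the centroid of the seed users' vectors, and a target user scores highly for a resource when the user's vector points in nearly the same direction as that centroid.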
Further, the method of the present application further includes the following steps:
performing posterior interaction screening according to the interaction results after similar-crowd diffusion;
and giving the screened content resources boosted (up-weighted) display across all users.
According to a second aspect of the embodiments of the present application, a multi-dialect, multi-modal resource recommendation device is provided, including:
an identification module, used for identifying the language/dialect type to which each content resource belongs, and marking the content resource with a corresponding resource-language type label according to the identification result;
an extraction module, used for extracting the language type and interest points of each user;
a cold-start probing module, used for screening first content resources from each type of content resource and performing cold-start probing of the first content resources in the corresponding target crowd;
and a propagation and diffusion module, used for screening second content resources from the first content resources according to the cold-start probing result and propagating and diffusing the second content resources among multilingual crowds.
The technical scheme provided by the embodiment of the application has the following beneficial effects:
based on machine translation and similar-crowd diffusion, the scheme achieves multilingual/dialect matching recommendation, diffuses language-specific resources among language-specific crowds, and strikes a personalized balance between accuracy and generalization.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the application.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the application and together with the description, serve to explain the principles of the application.
FIG. 1 is a schematic diagram of the content resource requirements of the various language groups in the Tibetan areas according to an embodiment of the present invention.
FIG. 2 is a panoramic framework diagram of recommendation product technology according to an embodiment of the present invention.
FIG. 3 is a flowchart of a multi-dialect, multi-modal resource recommendation method according to an embodiment of the present invention.
FIG. 4 is a logic diagram of the Lookalike algorithm according to an embodiment of the present invention.
FIG. 5 is a flowchart of the multi-modal recommendation technique based on Tibetan machine translation and crowd diffusion according to an embodiment of the present invention.
FIG. 6 is a block diagram of a multi-dialect, multi-modal resource recommendation device according to an embodiment of the present invention.
Detailed Description
Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numbers in different drawings refer to the same or similar elements, unless otherwise indicated. The implementations described in the following exemplary examples are not representative of all implementations consistent with the present application. Rather, they are merely examples of methods and apparatus consistent with aspects of the present application as detailed in the accompanying claims.
Given that the spoken pronunciation of the dialects in the Tibetan areas differs so much that their speakers find it hard to communicate directly, it is of great significance to build, tailored to local Tibetan residents, a content recommendation product with a better experience that covers multiple Tibetan dialects together with Chinese. Such a product also supports multi-modal content resource forms such as image-text and short video; the language requirements the product has to meet are shown in FIG. 1. In FIG. 1, a solid line denotes a resource type that the user views frequently, and a broken line denotes a resource type that the user views occasionally.
Recommendation products are currently very popular on the market and rely on many technical methods. Recommendation products for a single language, such as Chinese, generally combine multiple models. The technical framework of a complete recommendation product is shown in FIG. 2.
There are two currently mainstream multilingual recommendation techniques. The first works at the content-understanding stage of the "content model": machine translation uniformly converts multilingual content resources into a single language, the "user model" classifies users and labels them by language crowd, and recommendation is done by "content-based matching". The second works at the level of the "end-to-end model": cross-language resources are modeled jointly to associate content that is similar but written in different languages.
Existing technical schemes and products can also provide recommendation services for the multi-dialect, multi-ethnic population of the Tibetan areas, but the user experience is not ideal, for the following reasons:
1) Methods based on content matching and end-to-end models tend to recall content using the content features of the user's click history (such as clicked categories, labels and keywords) and recommend on that basis; content outside the preferences shown in the click history is not recalled for the user. Moreover, a user's click behavior has no exact correspondence with the user's language or dialect, so the dialect and language of the recommended content cannot be accurately controlled or regulated.
2) Content-matching methods can also be refined with rules, for example by restricting which language/dialect crowd labels are matched to which users, which gives some degree of control. However, content resources are extremely unevenly distributed in quantity among the Tibetan dialects and between Tibetan and Chinese, so however much they are refined, content-matching methods struggle to reach an ideal matching effect.
In fact, given the special multi-ethnic, multi-dialect environment of the Tibetan areas and the differing language/dialect preferences of individual residents, a recommendation technology is needed that strikes a personalized balance between accuracy and generalization in language/dialect matching.
The invention provides a novel recommendation method based on Tibetan-Chinese machine translation and crowd diffusion, which can effectively balance accuracy and generalization in the personalized language and dialect matching of a content recommendation product for local residents of the Tibetan areas.
One of the technical methods adopted by the application is crowd diffusion. It has so far been used only for targeted and accelerated diffusion of a specific resource type, such as the public resource form of "pay-for-reading", relying on a specific interested crowd. It has not been applied to the case of "diffusing language-specific resources among language-specific crowds".
FIG. 3 is a flowchart illustrating a multi-dialect multi-modal resource recommendation method, according to an exemplary embodiment. The method may comprise the steps of:
step S1, identifying the language/dialect type to which each content resource belongs, and marking the content resource with a corresponding resource-language type label according to the identification result;
step S2, extracting the language type and interest points of each user;
step S3, screening first content resources from each type of content resource, and performing cold-start probing of the first content resources in the corresponding target crowd;
and step S4, screening second content resources from the first content resources according to the cold-start probing result, and propagating and diffusing the second content resources among multilingual crowds.
Based on machine translation and similar-crowd diffusion, the scheme achieves multilingual/dialect matching recommendation, diffuses language-specific resources among language-specific crowds, and strikes a personalized balance between accuracy and generalization.
It should be understood that, although the steps in the flowchart of FIG. 3 are shown in the sequence indicated by the arrows, they are not necessarily performed in that sequence. Unless explicitly stated herein, the order of execution is not strictly limited, and the steps may be executed in other orders. Moreover, at least some of the steps in FIG. 3 may include multiple sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times, and which are not necessarily performed in sequence but may be performed in turn or alternately with other steps or with at least a portion of the sub-steps or stages of other steps.
In order to make the objects, technical solutions and advantages of the present invention more apparent, embodiments of the present invention will be described in further detail with reference to the accompanying drawings.
As shown in FIG. 4, the technical scheme of the invention is based on the idea of Lookalike. Lookalike, i.e., similar-crowd diffusion, is a technique that starts from seed users and uses some algorithm or evaluation model to find more similar people with potential relevance. Lookalike is not a specific algorithm but a generic term for a class of methods. The invention uses metric learning, a technique from meta-learning, to model the resource representation: building on a deep understanding of users, a resource in its early distribution period is represented by the seed users who clicked it rather than by its own interest points, so that new resources are modeled more accurately.
Meanwhile, in the content-understanding stage of the content model, an existing general-purpose Tibetan-Chinese multi-modal machine translation interface is called to identify the language/dialect type of each content resource, and each resource is marked with a resource-language type label such as "image-text-Chinese", "image-text-Tibetan", "short video-Chinese", "short video-Ü-Tsang dialect", "short video-Kham dialect" or "short video-Amdo dialect". The multi-modal content resources are uniformly translated into Chinese text and stored in the content model; the multi-modal content resources include Tibetan image-text content, Tibetan dialect speech in short videos, and Tibetan short-video subtitles. High-quality resources are then selected within each resource type and distributed and diffused by the similar-crowd diffusion method.
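As a minimal sketch of the labeling step, the following Python composes a resource-language type label from a detected modality and language. It uses the standard English dialect names Ü-Tsang, Kham and Amdo for the three Tibetan dialects. The function name `make_label` and the rule that image-text resources in any Tibetan dialect collapse to a single "Tibetan" label are assumptions made for illustration (the latter follows from the shared written form); the language detection itself is left to the external machine translation interface and is not modeled here.

```python
# Hypothetical names throughout; the source only lists the six label categories.
TIBETAN_DIALECTS = ("Ü-Tsang dialect", "Kham dialect", "Amdo dialect")

VALID_LABELS = {
    "image-text-Chinese",
    "image-text-Tibetan",
    "short video-Chinese",
    "short video-Ü-Tsang dialect",
    "short video-Kham dialect",
    "short video-Amdo dialect",
}


def make_label(modality: str, language: str) -> str:
    """Compose a resource-language type label from modality and language."""
    # Assumption: written Tibetan is shared across dialects, so image-text
    # resources get a single "Tibetan" label regardless of dialect.
    if modality == "image-text" and language in TIBETAN_DIALECTS:
        language = "Tibetan"
    label = f"{modality}-{language}"
    if label not in VALID_LABELS:
        raise ValueError(f"unsupported resource-language type: {label}")
    return label
```

Rejecting combinations outside the six listed categories, rather than inventing new labels, keeps the sketch within what the text actually enumerates.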
These two interrelated technical methods are the most important ones and are also the key innovations of the invention. Next, a specific application scenario of the present application is described in detail with reference to the technical flowchart in FIG. 5.
The method used in each stage is given in the text of the flowchart. In the figure, the target language crowd of S3 refers to the crowds linked by solid lines in FIG. 1. That is, for a content resource of a given language, for example an "image-text-Tibetan" resource, looking up the solid-line correspondences in FIG. 1 yields three target language crowds: "Ü-Tsang dialect users", "Kham dialect users" and "Amdo dialect users".
In some embodiments, screening the first content resources from each type of content resource in step S3 includes the following step: for each type of content resource, screening out the first content resources by a priori quality screening.
In some embodiments, performing cold-start probing of the first content resources in the corresponding target crowd in step S3 includes the following steps: acquiring a preset multilingual/dialect correspondence; determining the corresponding target language crowd according to the language type of the first content resource; and pushing the first content resource to the target language crowd for cold-start probing.
Specifically, referring to fig. 1, the preset multilingual/dialect correspondence relationship includes:
content resources of the image-text-Chinese type correspond to Chinese users, Ü-Tsang dialect users, Kham dialect users and Amdo dialect users;
content resources of the image-text-Tibetan type correspond to Ü-Tsang dialect users, Kham dialect users and Amdo dialect users;
content resources of the short video-Chinese type correspond to Chinese users, Ü-Tsang dialect users, Kham dialect users and Amdo dialect users;
content resources of the short video-Ü-Tsang dialect type correspond to Ü-Tsang dialect users;
content resources of the short video-Kham dialect type correspond to Kham dialect users;
content resources of the short video-Amdo dialect type correspond to Amdo dialect users.
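The correspondence above is a plain lookup table, and a short Python sketch may make its use in cold-start probing concrete. The three Tibetan dialects are written with their standard English names (Ü-Tsang, Kham, Amdo). The identifiers `TARGET_CROWDS`, `target_crowds` and `users_to_probe`, and the representation of users as an id-to-group dictionary, are hypothetical and chosen only for illustration.

```python
# User language groups (names are illustrative constants).
CHINESE = "Chinese users"
U_TSANG = "Ü-Tsang dialect users"
KHAM = "Kham dialect users"
AMDO = "Amdo dialect users"

TIBETAN_GROUPS = frozenset({U_TSANG, KHAM, AMDO})
ALL_GROUPS = TIBETAN_GROUPS | {CHINESE}

# Preset correspondence: resource-language type label -> target language crowds.
TARGET_CROWDS = {
    "image-text-Chinese": ALL_GROUPS,
    "image-text-Tibetan": TIBETAN_GROUPS,
    "short video-Chinese": ALL_GROUPS,
    "short video-Ü-Tsang dialect": frozenset({U_TSANG}),
    "short video-Kham dialect": frozenset({KHAM}),
    "short video-Amdo dialect": frozenset({AMDO}),
}


def target_crowds(label: str) -> frozenset:
    """Return the language crowds a resource of this type is probed in."""
    return TARGET_CROWDS[label]


def users_to_probe(label: str, user_groups: dict) -> list:
    """Select the users (id -> language group) eligible for cold-start probing."""
    crowds = target_crowds(label)
    return sorted(uid for uid, group in user_groups.items() if group in crowds)
```

For example, with users `{"u1": CHINESE, "u2": KHAM}`, a "short video-Kham dialect" resource would be probed only with `u2`.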
Next, the S4 stage, propagation and diffusion among multilingual crowds, is described in detail with reference to FIG. 5.
In some embodiments, screening the second content resources from the first content resources according to the cold-start probing result in step S4 includes the following steps: performing posterior interaction screening according to the cold-start probing result; and screening out the content resources with higher posterior interaction data indexes as the second content resources.
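The text does not say which posterior interaction data index is used or where the cut-off lies. The sketch below assumes, purely for illustration, that the index is click-through rate over the cold-start probing impressions; the class `ProbeStats`, the function names and both thresholds are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ProbeStats:
    """Interaction data collected for one resource during cold-start probing."""
    nid: str          # unique number of the content resource
    impressions: int  # times the resource was shown to the target crowd
    clicks: int       # times it was clicked


def click_through_rate(stats: ProbeStats) -> float:
    """Clicks per impression; 0.0 when the resource was never shown."""
    return stats.clicks / stats.impressions if stats.impressions else 0.0


def screen_second_resources(all_stats, min_impressions=100, min_ctr=0.05):
    """Keep resources with enough exposure and a high enough click-through rate.

    Both thresholds are illustrative assumptions, not values from the source.
    """
    return [
        s.nid
        for s in all_stats
        if s.impressions >= min_impressions and click_through_rate(s) >= min_ctr
    ]
```

The minimum-impressions guard is a design choice of this sketch: it avoids promoting a resource whose high rate rests on only a handful of views.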
In some embodiments, propagating and diffusing the second content resources among multilingual crowds in step S4 includes the following steps: determining the multilingual crowds based on a similar-crowd diffusion method; and propagating and diffusing the second content resources among the multilingual crowds. The similar-crowd diffusion method runs in three stages:
1. Usesis real-time stream: usesis, i.e., the session of user operations, writes the earliest users who click a resource in its propagation-diffusion period, for example the first 1000 clicking users, into redis. nid denotes the unique number of the resource.
2. Offline stage: (2.1) Read redis to obtain the historical click-user list of each nid in the propagation-diffusion period, which yields a plurality of seed users. (2.2) Request the existing gcf-user vector service interface to obtain the vectors of the seed users. The gcf-user vector service is built on a user behavior graph model in the manner common in the industry. (2.3) Apply mean pooling, as used in neural networks, to the seed user vectors to obtain the nid vector, and construct the resource vector library.
3. Online stage: (3.1) For a target user to be recommended to, request the gcf-user vector service interface to obtain the target user's vector. (3.2) Calculate the cosine similarity between the target user's vector and the nid vector of each seed user group in the resource vector library to obtain a similarity score between the target user and that resource's seed user group. (3.3) The system presents the resources with high score values to the target user. The score values are also used to weight other models such as rank.
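The three stages can be sketched end to end in Python. This is a minimal illustration under stated assumptions, not the patented implementation: a plain dictionary stands in for redis, a dictionary of precomputed vectors stands in for the gcf-user vector service, and all function names are hypothetical. Only the seed cap of 1000 users, mean pooling and cosine similarity come from the text.

```python
import math

MAX_SEEDS = 1000  # the text's example: the earliest 1000 clicking users


def record_click(store: dict, nid: str, user_id: str) -> None:
    """Real-time stage: keep the earliest distinct clickers per resource.

    `store` is an in-memory stand-in for redis (nid -> list of user ids).
    """
    seeds = store.setdefault(nid, [])
    if user_id not in seeds and len(seeds) < MAX_SEEDS:
        seeds.append(user_id)


def build_resource_vectors(store: dict, user_vectors: dict) -> dict:
    """Offline stage: mean-pool seed user vectors into one vector per nid.

    `user_vectors` stands in for the gcf-user vector service.
    """
    library = {}
    for nid, seeds in store.items():
        vecs = [user_vectors[u] for u in seeds if u in user_vectors]
        if vecs:
            dim = len(vecs[0])
            library[nid] = [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]
    return library


def cosine(a, b) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend(target_vector, library: dict, top_k: int = 10) -> list:
    """Online stage: score every resource for the target user, highest first."""
    scored = [(cosine(target_vector, vec), nid) for nid, vec in library.items()]
    scored.sort(reverse=True)
    return [(nid, score) for score, nid in scored[:top_k]]
```

In a deployment following the text, the returned score would not be the final ordering on its own; it would be passed on as a weight to other models such as rank.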
In some embodiments, the method of the present application further includes step S5: performing posterior interaction screening according to the interaction results after similar-crowd diffusion; and giving the screened content resources boosted (up-weighted) display across all users.
In addition, S3, S4 and S5 all involve resource presentation. For content resources displayed to a user, after clicking, the user can read with the aid of the text or speech transcription results generated by Tibetan machine translation. This assisted reading is offered as an advanced option of the product and further enhances the user experience.
The technical scheme adopted by the application has beneficial effects in the following three aspects.
1) Technical method. Supported by Tibetan machine translation, the method takes "cold-start probing in the target language crowd, then propagation and diffusion among multilingual crowds" as its main process. It models the specific needs of a multi-dialect, multi-modal recommendation product for residents of the Tibetan areas in a targeted way, and addresses the inability of prior methods to balance accuracy and generalization effectively when matching language requirements.
2) User experience. Local residents of the Tibetan areas get a recommendation product that better fits their language habits: resources in the languages they read often are pushed in large quantities, while resources in languages they read less often are pushed in small quantities of high-quality items. Crowd diffusion also improves the resource distribution rate, greatly increases the display share of new resources within 24 hours of release, and improves the freshness of the recommendation results.
3) Scale and ecology. A better user experience raises usage frequency, click counts, reading time and similar measures, which in turn raises the distribution volume and usage time of the whole recommendation product. The chain from cold-start probing to crowd diffusion in effect establishes a resource screening system: resource quality is evaluated by combining a priori and posterior data, inferior resources are eliminated and high-quality resources are fully distributed, which improves experience, scale and ecology.
FIG. 6 is a block diagram illustrating a multi-dialect, multi-modal resource recommendation device according to an exemplary embodiment. Referring to FIG. 6, the device includes an identification module, an extraction module, a cold-start probing module, and a propagation and diffusion module.
The identification module is used for identifying the language/dialect type to which each content resource belongs, and marking the content resource with a corresponding resource-language type label according to the identification result.
The extraction module is used for extracting the language type and interest points of each user.
The cold-start probing module is used for screening first content resources from each type of content resource and performing cold-start probing of the first content resources in the corresponding target crowd.
The propagation and diffusion module is used for screening second content resources from the first content resources according to the cold-start probing result and propagating and diffusing the second content resources among multilingual crowds.
The specific manner in which each module of the device in the above embodiment performs its operations has been described in detail in the method embodiments and is not repeated here. Each module of the above recommendation device may be implemented in whole or in part by software, hardware, or a combination thereof. The modules may be embedded in, or independent of, a processor of the computer device in hardware form, or stored in a memory of the computer device in software form, so that the processor can call them and execute the corresponding operations. It is to be understood that the same or similar parts of the above embodiments may be referred to each other, and content not described in detail in one embodiment may be found in the same or similar parts of other embodiments.
It should be noted that in the description of the present application, the terms "first," "second," and the like are used for descriptive purposes only and are not to be construed as indicating or implying relative importance. Furthermore, in the description of the present application, unless otherwise indicated, the meaning of "plurality" means at least two.
Any process or method descriptions in flow charts or otherwise described herein may be understood as representing modules, segments, or portions of code which include one or more executable instructions for implementing specific logical functions or steps of the process, and further implementations are included within the scope of the preferred embodiment of the present application in which functions may be executed out of order from that shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those reasonably skilled in the art of the embodiments of the present application.
It is to be understood that portions of the present application may be implemented in hardware, software, firmware, or a combination thereof. In the above-described embodiments, the various steps or methods may be implemented in software or firmware stored in a memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, they may be implemented using any one or a combination of the following techniques, which are well known in the art: discrete logic circuits having logic gates for implementing logic functions on data signals, application-specific integrated circuits having suitable combinational logic gates, programmable gate arrays (PGAs), field-programmable gate arrays (FPGAs), and the like.
Those of ordinary skill in the art will appreciate that all or some of the steps of the methods in the above-described embodiments may be carried out by a program instructing the relevant hardware, where the program may be stored in a computer-readable storage medium and, when executed, performs one or a combination of the steps of the method embodiments.
In addition, each functional unit in each embodiment of the present application may be integrated in one processing module, or each unit may exist alone physically, or two or more units may be integrated in one module. The integrated modules may be implemented in hardware or in software functional modules. The integrated modules may also be stored in a computer readable storage medium if implemented in the form of software functional modules and sold or used as a stand-alone product.
The above-mentioned storage medium may be a read-only memory, a magnetic disk or an optical disk, or the like.
In the description of the present specification, a description referring to terms "one embodiment," "some embodiments," "examples," "specific examples," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present application. In this specification, schematic representations of the above terms do not necessarily refer to the same embodiments or examples. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
Although embodiments of the present application have been shown and described above, it will be understood that the above embodiments are illustrative and are not to be construed as limiting the application, and that changes, modifications, substitutions, and variations may be made to the above embodiments by one of ordinary skill in the art within the scope of the application.

Claims (8)

1. A multi-language multi-modal resource recommendation method for Tibetan language, characterized by comprising the following steps:
identifying the language/dialect type to which each content resource belongs, and labeling the content resource with the corresponding resource-language type tag according to the identification result;
extracting the language type and interest points of the user;
screening a first content resource from the various types of content resources, and performing cold start heuristics on the first content resource in the corresponding target group;
screening a second content resource from the first content resource according to the cold start heuristic result, and propagating and diffusing the second content resource among multilingual groups; and
performing posterior interaction resource screening according to the interaction results after similar-crowd diffusion;
displaying the screened content resources with boosted weight across all users; and
for a content resource displayed to a user, after the user clicks it, providing assisted reading by combining the text or speech transcription results generated by existing general-purpose Tibetan-Chinese multi-modal machine translation technology;
wherein the categories of the resource-language type tags include: image-text-Chinese, image-text-Tibetan, short video-Chinese, short video-Ü-Tsang dialect, short video-Kham dialect, and/or short video-Amdo dialect; and
identifying the language/dialect type to which each content resource belongs comprises the following steps:
calling an existing general-purpose Tibetan-Chinese multi-modal machine translation interface to identify the language/dialect type to which each content resource belongs;
uniformly translating the multi-modal content resources into Chinese text and storing the text in a content model;
wherein the multi-modal content resources include: Tibetan image-text content, Tibetan short-video dialect speech, and/or Tibetan short-video subtitles.
2. The multi-language multi-modal resource recommendation method for Tibetan language according to claim 1, wherein screening the first content resource from the various types of content resources comprises the following step:
screening out the first content resource through prior high-quality resource screening for each type of content resource.
3. The multi-language multi-modal resource recommendation method for Tibetan language according to any one of claims 1-2, wherein performing cold start heuristics on the first content resource in the corresponding target group comprises the following steps:
acquiring a preset multi-language/dialect correspondence;
determining the corresponding target language group according to the language type of the first content resource;
pushing the first content resource to the target language group to perform the cold start heuristics.
4. The multi-language multi-modal resource recommendation method for Tibetan language according to claim 3, wherein the preset multi-language/dialect correspondence comprises:
content resources of the image-text-Chinese type correspond to Chinese users, Ü-Tsang dialect users, Kham dialect users, and Amdo dialect users;
content resources of the image-text-Tibetan type correspond to Ü-Tsang dialect users, Kham dialect users, and Amdo dialect users;
content resources of the short video-Chinese type correspond to Chinese users, Ü-Tsang dialect users, Kham dialect users, and Amdo dialect users;
content resources of the short video-Ü-Tsang dialect type correspond to Ü-Tsang dialect users;
content resources of the short video-Kham dialect type correspond to Kham dialect users;
content resources of the short video-Amdo dialect type correspond to Amdo dialect users.
5. The multi-language multi-modal resource recommendation method for Tibetan language according to any one of claims 1-2, wherein screening the second content resource from the first content resource according to the cold start heuristic result comprises the following steps:
performing posterior interaction resource screening according to the cold start heuristic result;
screening out the content resources with higher posterior interaction metrics as the second content resource.
6. The multi-language multi-modal resource recommendation method for Tibetan language according to claim 5, wherein propagating and diffusing the second content resource among multilingual groups comprises the following steps:
determining the multilingual groups based on a similar-crowd diffusion method;
propagating and diffusing the second content resource among the multilingual groups.
7. The multi-language multi-modal resource recommendation method for Tibetan language according to claim 6, wherein the similar-crowd diffusion method comprises the following steps:
writing the earliest users who clicked a resource in the propagation-diffusion stage into Redis;
reading Redis to obtain the historical click-user list of a propagation-diffusion-stage nid, thereby obtaining a plurality of seed users, where nid denotes the unique number of the content resource;
requesting the gcf-user vector service interface to obtain the vectors of the plurality of seed users;
applying average pooling to the plurality of seed user vectors to obtain the nid vector, and constructing a resource vector library;
for a target user to be recommended to, requesting the gcf-user vector service interface to obtain the user vector of the target user;
calculating the cosine similarity between the user vector of the target user and the nid vectors of the seed user groups in the resource vector library to obtain a similarity score between the target user and each resource's seed user group;
recommending the resources with high similarity scores to the target user.
8. A multi-language multi-modal resource recommendation device for Tibetan language, comprising:
an identification module, configured to identify the language/dialect type to which each content resource belongs, and to label the content resource with the corresponding resource-language type tag according to the identification result;
an extraction module, configured to extract the language type and interest points of the user;
a cold start heuristic module, configured to screen a first content resource from the various types of content resources and to perform cold start heuristics on the first content resource in the corresponding target group;
a propagation diffusion module, configured to screen a second content resource from the first content resource according to the cold start heuristic result and to propagate and diffuse the second content resource among multilingual groups; and
performing posterior interaction resource screening according to the interaction results after similar-crowd diffusion;
displaying the screened content resources with boosted weight across all users; and
for a content resource displayed to a user, after the user clicks it, providing assisted reading by combining the text or speech transcription results generated by existing general-purpose Tibetan-Chinese multi-modal machine translation technology;
wherein the categories of the resource-language type tags include: image-text-Chinese, image-text-Tibetan, short video-Chinese, short video-Ü-Tsang dialect, short video-Kham dialect, and/or short video-Amdo dialect; and
identifying the language/dialect type to which each content resource belongs comprises the following steps:
calling an existing general-purpose Tibetan-Chinese multi-modal machine translation interface to identify the language/dialect type to which each content resource belongs;
uniformly translating the multi-modal content resources into Chinese text and storing the text in a content model;
wherein the multi-modal content resources include: Tibetan image-text content, Tibetan short-video dialect speech, and/or Tibetan short-video subtitles.
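The similar-crowd diffusion steps recited in claim 7 (seed users, average pooling, cosine similarity) can be sketched in Python as follows. This is only an illustrative outline under stated assumptions: the Redis click list and the gcf-user vector service are replaced by plain in-memory dictionaries, and the function names and the `top_k` parameter are hypothetical.

```python
import math


def mean_pool(vectors):
    """Average pooling: element-wise mean of equally sized vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine(a, b):
    """Cosine similarity of two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_resource_vectors(click_log, user_vectors):
    """Build the resource vector library.

    click_log maps nid -> list of seed user ids (stand-in for the Redis list);
    user_vectors maps user id -> vector (stand-in for the vector service).
    """
    library = {}
    for nid, seed_users in click_log.items():
        seeds = [user_vectors[u] for u in seed_users if u in user_vectors]
        if seeds:
            library[nid] = mean_pool(seeds)
    return library


def recommend(target_vector, library, top_k=1):
    """Return the top_k nids whose seed-group vector best matches the target."""
    scored = sorted(
        ((cosine(target_vector, vec), nid) for nid, vec in library.items()),
        reverse=True,
    )
    return [nid for _, nid in scored[:top_k]]
```

In a deployment, `click_log` would be read from Redis and the vectors fetched from the user vector service; the pooling and scoring logic would stay the same.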
CN202310200016.6A 2023-03-06 2023-03-06 Multi-language multi-modal resource recommendation method and device for Tibetan language Active CN116089726B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310200016.6A CN116089726B (en) 2023-03-06 2023-03-06 Multi-language multi-modal resource recommendation method and device for Tibetan language

Publications (2)

Publication Number Publication Date
CN116089726A CN116089726A (en) 2023-05-09
CN116089726B true CN116089726B (en) 2023-07-14

Family

ID=86186982

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310200016.6A Active CN116089726B (en) 2023-03-06 2023-03-06 Multi-language multi-modal resource recommendation method and device for Tibetan language

Country Status (1)

Country Link
CN (1) CN116089726B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant