CN117591697A - Text recommendation method and system based on artificial intelligence and video processing - Google Patents

Info

Publication number
CN117591697A
Authority
CN
China
Prior art keywords
book
paragraph
user
target
interest
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202410078311.3A
Other languages
Chinese (zh)
Other versions
CN117591697B (en)
Inventor
和彩霞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chengdu Yadu Kesheng Technology Co ltd
Original Assignee
Chengdu Yadu Kesheng Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chengdu Yadu Kesheng Technology Co ltd
Priority to CN202410078311.3A
Publication of CN117591697A
Application granted
Publication of CN117591697B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70 Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/78 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/783 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G06F16/7844 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using original textual content or text extracted from visual content or transcript of audio data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70 Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/73 Querying
    • G06F16/735 Filtering based on additional data, e.g. user or group profiles
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0464 Convolutional networks [CNN, ConvNet]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0499 Feedforward networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Evolutionary Computation (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computing Systems (AREA)
  • Multimedia (AREA)
  • Databases & Information Systems (AREA)
  • Library & Information Science (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Image Analysis (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a text recommendation method and system based on artificial intelligence and video processing, and relates to the technical field of text recommendation. If a book reading operation by the user is detected, a front camera is opened to acquire a video of the user reading the book, and the mobile phone screen is recorded to acquire a screen-recorded video; the video of the user reading the book and the screen-recorded video are input into an interest paragraph determination model to determine a plurality of initial interest paragraphs in the book; a target paragraph is determined based on the text content corresponding to the plurality of initial interest paragraphs; a paragraph description image is generated from the text content of the target paragraph by using a generative adversarial network; the paragraph description image is input into a cover determination model to obtain a target book cover; and the target book corresponding to the target book cover is recommended to the user. The method can accurately recommend book text suitable for the user.

Description

Text recommendation method and system based on artificial intelligence and video processing
Technical Field
The invention relates to the technical field of text recommendation, in particular to a text recommendation method and system based on artificial intelligence and video processing.
Background
With the continuous progress of technology, artificial intelligence has become a hot topic in today's society. In the field of book recommendation, conventional methods generally recommend books based on factors such as the user's historical reading records and the popularity of books. However, these methods often fail to meet the personalized needs of the user. First, although a recommendation method based on the user's historical reading records reflects the user's reading interests to some extent, those interests change continuously; the historical record reflects only the past and cannot predict future reading needs, so the accuracy of the recommendations is often low. Second, although a recommendation method based on popularity can surface popular books, popularity does not necessarily indicate how much a particular user likes a book, so the accuracy of this method is also low and it cannot meet the user's personalized needs.
How to accurately recommend book text suitable for the user is therefore an urgent problem to be solved.
Disclosure of Invention
The invention mainly solves the technical problem of how to accurately recommend the book text suitable for the user.
According to a first aspect, the present invention provides a text recommendation method based on artificial intelligence and video processing, comprising: detecting whether a user starts a book reading operation; if it is detected that the user has started the book reading operation, opening a front camera to acquire a video of the user reading the book, and recording the mobile phone screen to acquire a screen-recorded video; inputting the video of the user reading the book and the screen-recorded video into an interest paragraph determination model to determine a plurality of initial interest paragraphs in the book; determining a target paragraph based on the text content corresponding to the plurality of initial interest paragraphs in the book; generating a paragraph description image based on the text content of the target paragraph by using a generative adversarial network; inputting the paragraph description image into a cover determination model to obtain a target book cover; and recommending the target book corresponding to the target book cover to the user.
Further, the interest paragraph determination model is a Transformer model; the input of the interest paragraph determination model is the video of the user reading the book and the screen-recorded video, and its output is the plurality of initial interest paragraphs in the book.
Still further, the method further comprises: acquiring a not-interested operation performed by the user on the target book (an operation indicating that the user is not interested in it); in response to the not-interested operation, removing the target paragraph from the plurality of initial interest paragraphs to obtain a plurality of remaining paragraphs; generating a plurality of paragraph description images based on the text content of the plurality of remaining paragraphs by using the generative adversarial network; acquiring a paragraph description image selected by the user from the plurality of paragraph description images; inputting the paragraph description image selected by the user into the cover determination model to obtain a book cover to be recommended; and recommending the book corresponding to the book cover to be recommended to the user.
Still further, detecting whether the user starts the book reading operation comprises: detecting whether the user clicks a start reading button.
Still further, the input of the generative adversarial network is the text content of the target paragraph, and the output of the generative adversarial network is the paragraph description image.
According to a second aspect, the present invention provides a text recommendation system based on artificial intelligence and video processing, comprising: a detection module, configured to detect whether a user starts a book reading operation;
an acquisition module, configured to, if it is detected that the user has started the book reading operation, open the front camera to acquire a video of the user reading the book and record the mobile phone screen to acquire a screen-recorded video;
an initial paragraph determination module, configured to input the video of the user reading the book and the screen-recorded video into an interest paragraph determination model to determine a plurality of initial interest paragraphs in the book;
a target paragraph determination module, configured to determine a target paragraph based on the text content corresponding to the plurality of initial interest paragraphs in the book;
a paragraph description image generation module, configured to generate a paragraph description image based on the text content of the target paragraph by using a generative adversarial network;
a target book cover determination module, configured to input the paragraph description image into a cover determination model to obtain a target book cover;
and a recommendation module, configured to recommend the target book corresponding to the target book cover to the user.
Further, the interest paragraph determination model is a Transformer model; the input of the interest paragraph determination model is the video of the user reading the book and the screen-recorded video, and its output is the plurality of initial interest paragraphs in the book.
Still further, the system is further configured to:
acquire a not-interested operation performed by the user on the target book (an operation indicating that the user is not interested in it);
in response to the not-interested operation, remove the target paragraph from the plurality of initial interest paragraphs to obtain a plurality of remaining paragraphs;
generate a plurality of paragraph description images based on the text content of the plurality of remaining paragraphs by using the generative adversarial network;
acquire a paragraph description image selected by the user from the plurality of paragraph description images;
input the paragraph description image selected by the user into the cover determination model to obtain a book cover to be recommended;
and recommend the book corresponding to the book cover to be recommended to the user.
Still further, the detection module is further configured to detect whether the user clicks a start reading button.
Still further, the input of the generative adversarial network is the text content of the target paragraph, and the output of the generative adversarial network is the paragraph description image.
The invention provides a text recommendation method and system based on artificial intelligence and video processing. The method comprises: detecting whether a user starts a book reading operation; if it is detected that the user has started the book reading operation, opening a front camera to acquire a video of the user reading the book, and recording the mobile phone screen to acquire a screen-recorded video; inputting the video of the user reading the book and the screen-recorded video into an interest paragraph determination model to determine a plurality of initial interest paragraphs in the book; determining a target paragraph based on the text content corresponding to the plurality of initial interest paragraphs in the book; generating a paragraph description image based on the text content of the target paragraph by using a generative adversarial network; inputting the paragraph description image into a cover determination model to obtain a target book cover; and recommending the target book corresponding to the target book cover to the user. The method can accurately recommend book text suitable for the user.
Drawings
FIG. 1 is a schematic flow chart of a text recommendation method based on artificial intelligence and video processing according to an embodiment of the present invention;
FIG. 2 is a schematic flow chart of recommending a book to be recommended according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a text recommendation system based on artificial intelligence and video processing according to an embodiment of the present invention.
Detailed Description
In an embodiment of the invention, a text recommendation method based on artificial intelligence and video processing is provided, as shown in FIG. 1. The method comprises the following steps S1 to S7:
Step S1, detecting whether a user starts a book reading operation.
Whether the user has started a book reading operation is determined by monitoring the user's behavior or operations, such as clicking a specific button or entering a specific mode. In some embodiments, this may be determined by detecting whether the user clicks a start reading button. As an example, a "start reading" button is displayed on the mobile phone screen; if the user clicks it, the user has started the book reading operation.
Step S2, if it is detected that the user has started the book reading operation, opening a front camera to acquire a video of the user reading the book, and recording the mobile phone screen to acquire a screen-recorded video.
The front camera is the camera on the front of the device, facing the user, and is used to capture the user's face.
The video of the user reading the book is a continuous image sequence, acquired by the front camera, that contains the user's face and actions. It records the user's behavior and reactions during reading.
The screen-recorded video is a recording of the screen content obtained through the mobile phone's screen recording function. It captures operations such as scrolling, page turning and annotating while the user reads the book.
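Steps S1 and S2 can be sketched as a small event handler. This is a minimal illustration only: the event name "start_reading_clicked" and the two capture callbacks are hypothetical placeholders, since the patent does not name any platform API for the camera or the screen recorder.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ReadingSession:
    """Starts both captures once the user opens the reading operation."""
    start_camera: Callable[[], None]            # hypothetical platform hook
    start_screen_recording: Callable[[], None]  # hypothetical platform hook
    capturing: bool = False

    def on_ui_event(self, event_name: str) -> None:
        # Step S1: only a click on the "start reading" button counts.
        if event_name != "start_reading_clicked" or self.capturing:
            return
        # Step S2: start both captures together so the two videos
        # cover the same reading session.
        self.start_camera()
        self.start_screen_recording()
        self.capturing = True
```

Guarding on `capturing` keeps a repeated click from starting a second recording of the same session.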
Step S3, inputting the video of the user reading the book and the screen-recorded video into an interest paragraph determination model to determine a plurality of initial interest paragraphs in the book.
The interest paragraph determination model is a Transformer model. Its input is the video of the user reading the book and the screen-recorded video, and its output is a plurality of initial interest paragraphs in the book. The Transformer model is one implementation of artificial intelligence.
The Transformer establishes global dependencies across the input sequence through a self-attention mechanism and can process the relationships among different elements in parallel. The Transformer model can understand the semantics of each element in the video of the user reading the book and in the screen-recorded video, and capture the associations between the two. The Transformer model includes an encoder and a decoder, each formed by stacking a plurality of identical layers. The encoder learns a representation of the input sequence; each of its layers contains a self-attention mechanism and a feed-forward network. The decoder adds a further multi-head attention mechanism on top of the encoder's structure, which it uses to decode the encoder output and generate the target sequence.
The Transformer model can capture both global information and local associations in the input sequence, and so better understand its context. The video of the user reading and the screen-recorded video are time series data: they contain the user's actions and facial expressions during reading, together with the order in which text, pictures and other content appear on the screen. From these videos, the Transformer can learn information such as the text content shown and the user's behavioral feedback, and map it into a semantic space related to the book content for understanding and analysis. As an example, a user may pause briefly on a paragraph, or return to it several times; the Transformer model can capture such repeated gazing or dwelling behavior, which helps the model determine the paragraphs the user is interested in.
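The self-attention mechanism mentioned above can be illustrated with a minimal pure-Python sketch of scaled dot-product attention. It assumes each video frame has already been turned into a feature vector, and, to keep it short, it uses the inputs directly as queries, keys and values; a real Transformer layer applies learned projection matrices and multiple heads, none of which the patent specifies.

```python
import math
from typing import List

Vector = List[float]


def softmax(scores: Vector) -> Vector:
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x: List[Vector]) -> List[Vector]:
    """Scaled dot-product self-attention with identity Q/K/V projections
    (a simplifying assumption for illustration)."""
    d = len(x[0])
    out = []
    for q in x:
        # Compare this element with every element of the sequence.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in x]
        w = softmax(scores)
        # Each output is a weighted average over the whole sequence,
        # which is how global dependencies enter the representation.
        out.append([sum(w[j] * x[j][c] for j in range(len(x))) for c in range(d)])
    return out
```

Because every output row is a weighted average of all input rows, a frame where the user lingers can influence the representation of every other frame in the sequence.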
In some embodiments, the interest paragraph determination model includes a video matching layer, a paragraph action determination layer, an interest degree determination layer and an interest paragraph screening layer, each of which comprises a Transformer structure. The input of the video matching layer is the video of the user reading the book and the screen-recorded video; its output is, for each paragraph of the book, the corresponding segment of the video of the user reading the book and the corresponding segment of the screen-recorded video. The input of the paragraph action determination layer is these two video segments for each paragraph; its output is the reading duration, the facial expression sequence, the user's gesture operations and the eye movement sequence corresponding to each paragraph. The input of the interest degree determination layer is the reading duration, the facial expression sequence, the user's gesture operations and the eye movement sequence corresponding to each paragraph; its output is the interest degree of each paragraph of the book. The input of the interest paragraph screening layer is the interest degree of each paragraph; its output is the plurality of initial interest paragraphs.
The reading duration, facial expression sequence, gesture operations and eye movement sequence corresponding to each paragraph are used to judge the user's degree of interest in that paragraph, and thereby to determine the plurality of initial interest paragraphs. The reading duration is the time the user stays on the paragraph: a long dwell may indicate that the user is interested in the paragraph content, while a short dwell may indicate a lack of interest. The facial expression sequence is the sequence of changes in the user's facial expression while reading the paragraph; as an example, if the user shows excitement or strong concentration, the user may be considered interested in the paragraph content. The gesture operations are operations such as page turning, marking and swiping performed while reading the paragraph; as an example, if the user marks or highlights a paragraph, the user's interest in that paragraph is high. The eye movement sequence is the sequence of changes in the user's eye movements. Analyzing it shows how much attention the user pays to each paragraph and for how long. For example, looking at a particular paragraph for a long time may indicate interest in its content, while frequent glances and rapid jumps may indicate that the paragraph is of low interest or is difficult to understand.
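The last two layers (interest degree determination and interest paragraph screening) can be pictured with a hand-written stand-in. In the patent these layers are learned Transformer structures; the fixed weights, the 0 to 1 score ranges, the 60-second cap and the threshold below are all assumptions chosen only to make the data flow concrete.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ParagraphSignals:
    paragraph_id: str
    reading_seconds: float   # dwell time on the paragraph
    expression_score: float  # 0..1, hypothetical "engaged expression" score
    marked: bool             # user marked or highlighted the paragraph
    fixation_ratio: float    # 0..1, share of time the gaze stayed on the paragraph


def interest_degree(s: ParagraphSignals, max_seconds: float = 60.0) -> float:
    """Weighted sum of the four signals (weights are illustrative, not learned)."""
    dwell = min(s.reading_seconds, max_seconds) / max_seconds
    return (0.4 * dwell
            + 0.2 * s.expression_score
            + 0.2 * (1.0 if s.marked else 0.0)
            + 0.2 * s.fixation_ratio)


def screen_paragraphs(signals: List[ParagraphSignals],
                      top_k: int = 3,
                      threshold: float = 0.5) -> List[str]:
    """Keep at most top_k paragraphs whose interest degree reaches the threshold."""
    ranked = sorted(signals, key=interest_degree, reverse=True)
    return [s.paragraph_id for s in ranked if interest_degree(s) >= threshold][:top_k]
```

A learned model would replace `interest_degree`, but the screening step would still reduce per-paragraph scores to a short list of initial interest paragraphs.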
Step S4, determining a target paragraph based on the text content corresponding to the plurality of initial interest paragraphs in the book.
The target paragraph is the paragraph, screened from the plurality of initial interest paragraphs in the book, whose content is most representative; it is used for book recommendation. As an example, one target paragraph may be: "A beautiful sunset: its afterglow spills across the lake surface, shimmering golden."
In some embodiments, the target paragraph may be determined by a target paragraph determination model, which is a deep neural network model. The input of the target paragraph determination model is the text content corresponding to the plurality of initial interest paragraphs in the book, and its output is the target paragraph. The deep neural network model includes a deep neural network (DNN) and is one implementation of artificial intelligence. The deep neural network may include a recurrent neural network (RNN), a convolutional neural network (CNN), and so on. During training, the model optimizes its weights and biases over many iterations, gradually improving its ability to understand text data. The learned representation can capture similarity, relevance and semantic information across paragraphs, which allows the model to infer, from the input initial interest paragraphs, the target paragraph with the most representative content. When a new set of initial interest paragraphs is input, the model maps it to the most representative target paragraph based on what it learned during training. This mapping rests on the model's abstraction and understanding of the text data; the model identifies the target paragraph through pattern matching, semantic association and similar mechanisms.
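As a rough intuition for "most representative", the sketch below picks the paragraph that is on average most similar to the others, using bag-of-words cosine similarity. This is a deliberately simple substitute for the trained deep neural network described above, not the patent's model; whitespace tokenization in particular is an assumption that would not suit Chinese text.

```python
import math
from collections import Counter
from typing import List


def _cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def pick_target_paragraph(paragraphs: List[str]) -> str:
    """Return the paragraph with the highest mean similarity to the rest."""
    if not paragraphs:
        raise ValueError("at least one initial interest paragraph is required")
    bags = [Counter(p.lower().split()) for p in paragraphs]

    def centrality(i: int) -> float:
        others = [_cosine(bags[i], bags[j]) for j in range(len(bags)) if j != i]
        return sum(others) / len(others) if others else 0.0

    return paragraphs[max(range(len(paragraphs)), key=centrality)]
```

A trained model would replace the word-overlap similarity with learned semantic representations, but the selection criterion (one paragraph standing in for the whole set) is the same idea.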
Step S5, generating a paragraph description image based on the text content of the target paragraph by using a generative adversarial network.
The input of the generative adversarial network is the text content of the target paragraph, and its output is the paragraph description image. A generative adversarial network (GAN) comprises a generator and a discriminator, which compete with each other and are optimized in alternation so that the generator learns to produce realistic data.
The generative adversarial network learns a feature representation of the data through training. The generator produces image samples, and the discriminator tries to distinguish the images produced by the generator from real images. As training proceeds, the generator gradually learns to produce more realistic images, and the discriminator becomes better at identifying which images are generated. In this way the generative adversarial network can translate the text content of the target paragraph into a paragraph description image.
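The competition between generator and discriminator is usually expressed through two loss functions. The sketch below computes the standard binary cross-entropy discriminator loss and the commonly used non-saturating generator loss from discriminator outputs; the patent does not state which losses it uses, so this is an assumption, and the inputs are taken to be probabilities strictly between 0 and 1.

```python
import math
from typing import List


def discriminator_loss(d_real: List[float], d_fake: List[float]) -> float:
    """Low when real images score near 1 and generated images score near 0."""
    real_term = -sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = -sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term


def generator_loss(d_fake: List[float]) -> float:
    """Non-saturating form: low when generated images fool the discriminator."""
    return -sum(math.log(p) for p in d_fake) / len(d_fake)
```

Training alternates between lowering `discriminator_loss` with the generator fixed and lowering `generator_loss` with the discriminator fixed; in a text-to-image setting both networks would additionally be conditioned on an encoding of the paragraph text.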
The paragraph description image is the image that the generative adversarial network produces from the text content of the target paragraph.
Step S6, inputting the paragraph description image into a cover determination model to obtain the target book cover.
The cover determination model is a deep neural network model. Its input is the paragraph description image, and its output is the target book cover. The target book cover is the book cover most similar to, or best matching, the paragraph description image.
As an example, the paragraph description image depicts the scene "on a sunny afternoon, a girl and her grandpa plant bright carnations in a garden, both faces beaming with happy smiles", and the target book cover determined by the cover determination model is the cover of a related storybook titled "Memory in the Sun". The cover presents a similar scene, a warm moment between the girl and her grandpa, together with the title and author information. The cover determination model learns the association between paragraph description images and book covers through training. A paragraph description image produced by the generative adversarial network reflects the content described in the paragraph and its characteristics more accurately, so matching book covers against the paragraph description image improves matching accuracy. Converting the paragraph text into an image and matching it to a book cover also gives the user a more visual and intuitive experience: the user can grasp the theme, mood and style of a book directly from its cover, and can more easily choose books of interest.
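One plausible reading of "most similar or best matched" is a nearest-neighbor search in a shared feature space. The sketch below assumes that the paragraph description image and every candidate cover have already been encoded as feature vectors by some image encoder (not specified in the patent), and returns the cover with the highest cosine similarity; the cover identifiers are hypothetical.

```python
import math
from typing import Dict, List, Tuple


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two feature vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def match_cover(image_vec: List[float],
                cover_vecs: Dict[str, List[float]]) -> Tuple[str, float]:
    """Return (cover_id, similarity) of the cover closest to the image."""
    if not cover_vecs:
        raise ValueError("no candidate covers to match against")
    return max(((cid, cosine(image_vec, vec)) for cid, vec in cover_vecs.items()),
               key=lambda pair: pair[1])
```

Returning the similarity alongside the identifier lets a caller decline to recommend anything when even the best match is weak.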
Step S7, recommending the target book corresponding to the target book cover to the user.
The target book is the book to which the target book cover belongs.
In some embodiments, if the user is not interested in the target book, a book to be recommended may be recommended according to the flow shown in FIG. 2. FIG. 2 is a schematic flow chart of recommending a book to be recommended according to an embodiment of the present invention, comprising steps S21 to S26:
Step S21, acquiring a not-interested operation performed by the user on the target book.
This step obtains an operation or feedback by which the user indicates a lack of interest in the target book. As an example, the user clicks a "not interested" button, or performs another corresponding operation in the application or platform, to indicate that the target book is not of interest.
Step S22, in response to the not-interested operation, removing the target paragraph from the plurality of initial interest paragraphs to obtain a plurality of remaining paragraphs.
The target paragraph is removed from the plurality of initial interest paragraphs that contain it, leaving the plurality of remaining paragraphs.
As an example, if the plurality of initial interest paragraphs are paragraph "A", paragraph "B", paragraph "C" and paragraph "D", and the target paragraph is paragraph "B", then the plurality of remaining paragraphs are paragraph "A", paragraph "C" and paragraph "D".
Step S23, generating a plurality of paragraph description images based on the text content of the plurality of remaining paragraphs by using the generative adversarial network.
For a description of the generative adversarial network, see step S5. The generative adversarial network generates one paragraph description image from the text content of each remaining paragraph.
Step S24, acquiring a paragraph description image selected by the user, the selected image being one chosen by the user from the plurality of paragraph description images.
In this step the user selects and confirms a paragraph description image of interest from the generated images. As an example, the user selects one of the presented paragraph description images, indicating an interest in the corresponding paragraph.
Step S25, inputting the paragraph description image selected by the user into the cover determination model to obtain the book cover to be recommended.
The cover determination model takes the paragraph description image selected by the user as input and outputs the corresponding book cover to be recommended. For a description of the cover determination model, see step S6.
Step S26, recommending the book corresponding to the book cover to be recommended to the user.
The book cover to be recommended, obtained from the paragraph description image selected by the user, is associated with its corresponding book, and that book is recommended to the user.
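The fallback flow of steps S22 to S26 can be sketched as a single function. The three callables stand in for the generative adversarial network, the user's choice, and the cover determination model plus book lookup; their names and string-based signatures are hypothetical and exist only to show the order of operations.

```python
from typing import Callable, List


def remove_target(initial: List[str], target: str) -> List[str]:
    """Step S22: drop the rejected target paragraph, keep the rest in order."""
    return [p for p in initial if p != target]


def fallback_recommendation(initial: List[str],
                            target: str,
                            describe: Callable[[str], str],
                            choose: Callable[[List[str]], str],
                            cover_to_book: Callable[[str], str]) -> str:
    remaining = remove_target(initial, target)  # S22
    images = [describe(p) for p in remaining]   # S23: one image per paragraph
    chosen = choose(images)                     # S24: user picks one image
    return cover_to_book(chosen)                # S25-S26: cover, then book
```

Passing the models in as callables keeps the control flow testable without any trained network.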
Based on the same inventive concept, FIG. 3 is a schematic diagram of a text recommendation system based on artificial intelligence and video processing according to an embodiment of the present invention. The text recommendation system based on artificial intelligence and video processing includes:
a detection module 31, configured to detect whether a user starts a book reading operation;
an acquiring module 32, configured to, if it is detected that the user has started the book reading operation, turn on the front camera to capture a video of the user reading the book and, at the same time, record the mobile phone screen to obtain the screen recorded video;
an initial paragraph determination module 33 for inputting the video of the user reading the book and the screen recorded video into an interest paragraph determination model to determine a plurality of initial interest paragraphs in the book;
a target paragraph determining module 34, configured to determine a target paragraph based on text content corresponding to a plurality of initial interesting paragraphs in the book;
a paragraph description image generation module 35, configured to generate a paragraph description image based on the text content of the target paragraph using a generative adversarial network;
a target book cover determining module 36, configured to input the paragraph description image into a cover determination model to obtain a target book cover;
and a recommending module 37, configured to recommend the target book corresponding to the target book cover to the user.
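The way modules 31 to 37 hand data to one another can be sketched as a small pipeline. This is only an illustration under assumptions: each module is modeled as a plain callable, the class name `TextRecommendationSystem` and all field names are hypothetical, and the stand-in lambdas below do no real video processing or image generation.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Optional, Tuple


@dataclass
class TextRecommendationSystem:
    detect_reading_started: Callable[[], bool]                   # detection module 31
    acquire_videos: Callable[[], Tuple[Any, Any]]                # acquiring module 32
    find_interest_paragraphs: Callable[[Any, Any], List[str]]    # initial paragraph determination module 33
    pick_target_paragraph: Callable[[List[str]], str]            # target paragraph determining module 34
    generate_description_image: Callable[[str], Any]             # paragraph description image generation module 35
    determine_cover: Callable[[Any], str]                        # target book cover determining module 36
    recommend: Callable[[str], str]                              # recommending module 37

    def run(self) -> Optional[str]:
        # Nothing is captured unless a reading operation has been detected.
        if not self.detect_reading_started():
            return None
        reader_video, screen_video = self.acquire_videos()
        paragraphs = self.find_interest_paragraphs(reader_video, screen_video)
        target = self.pick_target_paragraph(paragraphs)
        image = self.generate_description_image(target)
        cover = self.determine_cover(image)
        return self.recommend(cover)


system = TextRecommendationSystem(
    detect_reading_started=lambda: True,
    acquire_videos=lambda: ("reader.mp4", "screen.mp4"),
    find_interest_paragraphs=lambda reader, screen: ["a storm at sea", "a quiet village"],
    pick_target_paragraph=lambda paragraphs: paragraphs[0],
    generate_description_image=lambda text: f"image<{text}>",
    determine_cover=lambda image: f"cover<{image}>",
    recommend=lambda cover: f"book for {cover}",
)
print(system.run())  # book for cover<image<a storm at sea>>
```

Each field corresponds to one module of FIG. 3, so a real model can be substituted for any single stand-in without changing `run`.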

Claims (10)

1. A text recommendation method based on artificial intelligence and video processing, comprising:
detecting whether a user starts a book reading operation;
if it is detected that the user has started the book reading operation, turning on a front camera to capture a video of the user reading the book, and recording the mobile phone screen to obtain a screen recorded video;
inputting the video of the user reading the book and the screen recorded video into an interest paragraph determination model to determine a plurality of initial interest paragraphs in the book;
determining a target paragraph based on text content corresponding to a plurality of initial interest paragraphs in the book;
generating a paragraph description image based on the text content of the target paragraph using a generative adversarial network;
inputting the paragraph description image into a cover determination model to obtain a target book cover;
and recommending the target book corresponding to the target book cover to a user.
2. The text recommendation method based on artificial intelligence and video processing of claim 1, wherein the interest paragraph determination model is a Transformer model, the input of the interest paragraph determination model is the video of the user reading the book and the screen recorded video, and the output of the interest paragraph determination model is the plurality of initial interest paragraphs in the book.
3. The artificial intelligence and video processing based text recommendation method according to claim 1, wherein the method further comprises:
acquiring a not-interested operation performed by the user on the target book;
in response to the not-interested operation performed by the user on the target book, removing the target paragraph from the plurality of initial interest paragraphs to obtain a plurality of paragraphs remaining after the removal;
generating a plurality of paragraph description images using the generative adversarial network, based on the text content of the plurality of paragraphs remaining after the removal;
acquiring a paragraph description image selected by a user, wherein the paragraph description image selected by the user is a paragraph description image selected by the user from a plurality of paragraph description images;
inputting the paragraph description image selected by the user into the cover determination model to obtain a book cover to be recommended;
and recommending, to the user, the book to be recommended that corresponds to the book cover to be recommended.
4. The text recommendation method based on artificial intelligence and video processing according to claim 1, wherein the detecting whether a user starts a book reading operation comprises: detecting whether the user clicks a start-reading button.
5. The text recommendation method based on artificial intelligence and video processing of claim 1, wherein the input of the generative adversarial network is the text content of the target paragraph and the output of the generative adversarial network is the paragraph description image.
6. A text recommendation system based on artificial intelligence and video processing, comprising:
a detection module, configured to detect whether a user starts a book reading operation;
an acquisition module, configured to, if it is detected that the user has started the book reading operation, turn on a front camera to capture a video of the user reading the book and record the mobile phone screen to obtain a screen recorded video;
an initial paragraph determination module for inputting the video of the user reading the book and the screen recorded video into an interest paragraph determination model to determine a plurality of initial interest paragraphs in the book;
a target paragraph determining module, configured to determine a target paragraph based on text contents corresponding to a plurality of initial interest paragraphs in the book;
a paragraph description image generation module, configured to generate a paragraph description image based on the text content of the target paragraph using a generative adversarial network;
a target book cover determination module, configured to input the paragraph description image into a cover determination model to obtain a target book cover;
and a recommendation module, configured to recommend the target book corresponding to the target book cover to the user.
7. The text recommendation system based on artificial intelligence and video processing of claim 6, wherein the interest paragraph determination model is a Transformer model, the input of the interest paragraph determination model is the video of the user reading the book and the screen recorded video, and the output of the interest paragraph determination model is the plurality of initial interest paragraphs in the book.
8. The artificial intelligence and video processing based text recommendation system according to claim 6, wherein the system is further configured to:
acquiring a not-interested operation performed by the user on the target book;
in response to the not-interested operation performed by the user on the target book, removing the target paragraph from the plurality of initial interest paragraphs to obtain a plurality of paragraphs remaining after the removal;
generating a plurality of paragraph description images using the generative adversarial network, based on the text content of the plurality of paragraphs remaining after the removal;
acquiring a paragraph description image selected by a user, wherein the paragraph description image selected by the user is a paragraph description image selected by the user from a plurality of paragraph description images;
inputting the paragraph description image selected by the user into the cover determination model to obtain a book cover to be recommended;
and recommending, to the user, the book to be recommended that corresponds to the book cover to be recommended.
9. The text recommendation system based on artificial intelligence and video processing of claim 6, wherein the detection module is further configured to detect whether the user clicks a start-reading button.
10. The artificial intelligence and video processing based text recommendation system of claim 6, wherein the input of the generative adversarial network is the text content of the target paragraph and the output of the generative adversarial network is the paragraph description image.
CN202410078311.3A 2024-01-19 2024-01-19 Text recommendation method and system based on artificial intelligence and video processing Active CN117591697B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410078311.3A CN117591697B (en) 2024-01-19 2024-01-19 Text recommendation method and system based on artificial intelligence and video processing

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202410078311.3A CN117591697B (en) 2024-01-19 2024-01-19 Text recommendation method and system based on artificial intelligence and video processing

Publications (2)

Publication Number Publication Date
CN117591697A CN117591697A (en) 2024-02-23
CN117591697B CN117591697B (en) 2024-03-29

Family

ID=89922412

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410078311.3A Active CN117591697B (en) 2024-01-19 2024-01-19 Text recommendation method and system based on artificial intelligence and video processing

Country Status (1)

Country Link
CN (1) CN117591697B (en)

Citations (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103839062A (en) * 2014-03-11 2014-06-04 东方网力科技股份有限公司 Image character positioning method and device
CN104298682A (en) * 2013-07-18 2015-01-21 广州华久信息科技有限公司 Information recommendation effect evaluation method and mobile phone based on facial expression images
CN107169002A (en) * 2017-03-31 2017-09-15 咪咕数字传媒有限公司 A kind of personalized interface method for pushing and device recognized based on face
CN107679070A (en) * 2017-08-22 2018-02-09 科大讯飞股份有限公司 Intelligent reading recommendation method and device and electronic equipment
CN109213932A (en) * 2018-08-09 2019-01-15 咪咕数字传媒有限公司 Information pushing method and device
CN111930667A (en) * 2020-07-09 2020-11-13 上海连尚网络科技有限公司 Method and device for book recommendation in reading application
CN111931062A (en) * 2020-08-28 2020-11-13 腾讯科技(深圳)有限公司 Training method and related device of information recommendation model
CN113642673A (en) * 2021-08-31 2021-11-12 北京字跳网络技术有限公司 Image generation method, device, equipment and storage medium
US20220253721A1 (en) * 2021-01-30 2022-08-11 Walmart Apollo, Llc Generating recommendations using adversarial counterfactual learning and evaluation
US20220343100A1 (en) * 2021-04-23 2022-10-27 Ping An Technology (Shenzhen) Co., Ltd. Method for cutting video based on text of the video and computing device applying method
CN115461793A (en) * 2020-02-29 2022-12-09 具象有限公司 System and method for interactive multimodal book reading
CN115797488A (en) * 2022-11-28 2023-03-14 科大讯飞股份有限公司 Image generation method and device, electronic equipment and storage medium
US20230196000A1 (en) * 2021-12-21 2023-06-22 Woongjin Thinkbig Co., Ltd. System and method for providing personalized book
CN116485948A (en) * 2023-04-30 2023-07-25 上海芯赛云计算科技有限公司 Text image generation method and system based on recommendation algorithm and diffusion model
CN117216535A (en) * 2023-02-16 2023-12-12 腾讯科技(深圳)有限公司 Training method, device, equipment and medium for recommended text generation model

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
QU W et al.: "A novel approach based on multi-view content analysis and semi-supervised enrichment for movie recommendation", 《JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY》, 30 September 2013 (2013-09-30), pages 776, XP035351298, DOI: 10.1007/s11390-013-1376-7 *
曹斌 et al.: "基于用户隐性反馈与协同过滤相结合的电子书籍推荐服务", 《小型微型计算机系统》, vol. 38, no. 2, 30 June 2015 (2015-06-30), pages 252 - 255 *

Also Published As

Publication number Publication date
CN117591697B (en) 2024-03-29

Similar Documents

Publication Publication Date Title
JP6431231B1 (en) Imaging system, learning apparatus, and imaging apparatus
Del Molino et al. Summarization of egocentric videos: A comprehensive survey
Newman et al. Multimodal memorability: Modeling effects of semantics and decay on video memorability
CN102207950B (en) Electronic installation and image processing method
CN103988202A (en) Image attractiveness based indexing and searching
CN101783886A (en) Information processing apparatus, information processing method, and program
CN114390217B (en) Video synthesis method, device, computer equipment and storage medium
CN103959227A (en) Life-logging and memory sharing
US10755087B2 (en) Automated image capture based on emotion detection
WO2019196795A1 (en) Video editing method, device and electronic device
KR20190053481A (en) Apparatus and method for user interest information generation
Maybury Multimedia information extraction: Advances in video, audio, and imagery analysis for search, data mining, surveillance and authoring
Sharma et al. Emotion-based music recommendation system
CN115909390B (en) Method, device, computer equipment and storage medium for identifying low-custom content
CN113079420A (en) Video generation method and device, electronic equipment and computer readable storage medium
Jishan et al. Hybrid deep neural network for bangla automated image descriptor
CN117591697B (en) Text recommendation method and system based on artificial intelligence and video processing
US20230066331A1 (en) Method and system for automatically capturing and processing an image of a user
CN114443916A (en) Supply and demand matching method and system for test data
Yang et al. Learning the synthesizability of dynamic texture samples
CN111931510B (en) Intention recognition method and device based on neural network and terminal equipment
Venkatesh et al. “You Tube and I Find”—Personalizing multimedia content access
Hoy Deep learning and online video: Advances in transcription, automated indexing, and manipulation
Vayadande et al. The Rise of AI‐Generated News Videos: A Detailed Review
Shah et al. Video to text summarisation and timestamp generation to detect important events

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant