CN113570687A - File processing method and device

Info

Publication number: CN113570687A
Application number: CN202110826443.6A
Authority: CN
Inventors: 田野
Original assignee: Tencent Technology Shenzhen Co Ltd
Current assignee: Tencent Technology Shenzhen Co Ltd
Priority date: 2021-07-21
Filing date: 2021-07-21
Publication date: 2021-10-29

Abstract

An embodiment of the application discloses a file processing method and device. The method includes: displaying text data in a first display page of a media file, and displaying, for the text data, M candidate pictures in a data recommendation area, where M is a positive integer; in response to a picture selection operation on a target picture among the M candidate pictures, displaying N layout pages in the data recommendation area, where each layout page includes the target picture and the text data displayed in the first display page of the media file, and N is a positive integer; and in response to a page selection operation on a target layout page among the N layout pages, switching the first display page to display the target layout page. The method and device can improve the generation efficiency of media files and reduce the time needed to generate them.

Description

File processing method and device

Technical Field

The present application relates to the field of computer technologies, and in particular, to a file processing method and apparatus.

Background

When making a media file, a user often inserts pictures to enrich its pages and improve its display effect. To do so, the user has to search for a suitable picture according to the content of the media file, insert the selected picture, and then re-typeset the media file so that the inserted picture fits the page. As a result, the user frequently has to open a third-party application to search for pictures, the searching and the typesetting after insertion consume a lot of time, and the layout of the media file has to be adjusted repeatedly, so media files are generated inefficiently.

Disclosure of Invention

The embodiment of the application provides a file processing method and device, which can improve the generation efficiency of media files and save the generation time of the media files.

An aspect of an embodiment of the present application provides a file processing method, including:

displaying text data in a first display page of a media file, and displaying, for the text data, M candidate pictures in a data recommendation area; M is a positive integer;

in response to a picture selection operation on a target picture among the M candidate pictures, displaying N layout pages in the data recommendation area; each layout page comprises the target picture and the text data displayed in the first display page of the media file; N is a positive integer;

and in response to a page selection operation on a target layout page among the N layout pages, switching the first display page to display the target layout page.

An aspect of an embodiment of the present application provides a file processing apparatus, where the apparatus includes:

the candidate picture display module is used for displaying text data in a first display page of a media file and displaying, for the text data, M candidate pictures in a data recommendation area; M is a positive integer;

the layout display module is used for responding to a picture selection operation on a target picture among the M candidate pictures and displaying N layout pages in the data recommendation area; each layout page comprises the target picture and the text data displayed in the first display page of the media file; N is a positive integer;

and the layout selection module is used for responding to a page selection operation on a target layout page among the N layout pages and switching the first display page to display the target layout page.

Wherein, the candidate picture display module comprises:

a text input unit for displaying target text data in the media file in response to a target text data input operation for the media file;

and the candidate picture display unit is used for responding to the picture recommendation operation aiming at the target text data and displaying M candidate pictures related to the target text data in the data recommendation area.

Wherein, the candidate picture display module comprises:

the candidate picture display unit is further used for displaying target text data in a first display page of the media file and displaying M candidate pictures associated with the target text data in the data recommendation area; or,

the candidate picture display unit is further used for displaying target text data in a first display page of the media file and, when a trigger operation on the picture recommendation component is received, displaying M candidate pictures associated with the target text data in the data recommendation area; or,

the component display unit is used for displaying target text data in a first display page of the media file and displaying the picture recommendation component;

and the component triggering unit is used for responding to triggering operation of the picture recommending component and displaying M candidate pictures associated with the target text data in the data recommending area.

Wherein, the device still includes:

the utilization rate display module is used for responding to the viewing operation aiming at the target picture in the M candidate pictures and displaying the historical utilization rate of the target picture; the historical utilization rate is used to represent the probability that the target picture is selected.

The layout page also comprises a text box to be edited; the device also includes:

and the text box display module is used for displaying a content prompt text in a target text box to be edited of the target layout page and, in response to an input operation on the target text box to be edited, replacing the content prompt text in the target text box to be edited with the text content corresponding to the input operation.

The media file comprises at least two display pages, wherein the at least two display pages comprise a first display page; the device also includes:

the mark display module is used for applying an illustration mark to a second display page, among the at least two display pages, that has the same semantic information as the first display page; the illustration mark is used for indicating that the M candidate pictures are recommended for the second display page.

Wherein, the candidate picture display module comprises:

the text acquisition unit is used for responding to a picture recommendation request aiming at the media file and acquiring text data displayed in a first display page in the media file based on the picture recommendation request;

the keyword extraction unit is used for extracting keywords from the text data to obtain text keywords corresponding to the text data;

and the picture acquisition unit is used for acquiring M candidate pictures matched with the text keywords from the candidate recommendation gallery and displaying the M candidate pictures in the data recommendation area.

Wherein, this keyword extraction unit includes:

a text word segmentation subunit, used for performing word segmentation on the text data to obtain f₁ segmented phrases; f₁ is a positive integer;

a word frequency obtaining subunit, used for obtaining the phrase frequency corresponding to each of the f₁ segmented phrases;

an inverse frequency acquisition subunit, used for obtaining the inverse document frequency corresponding to each of the f₁ segmented phrases;

an importance determining subunit, used for determining the phrase importance corresponding to each of the f₁ segmented phrases according to the phrase frequency and the inverse document frequency corresponding to that phrase;

a keyword selection subunit, used for determining, based on the phrase importance, the text keywords corresponding to the text data from the f₁ segmented phrases.

Wherein, this word frequency obtains subunit, includes:

a phrase dividing subunit, used for grouping identical segmented phrases among the f₁ segmented phrases into f₂ phrase sets; the segmented phrases included in each phrase set are identical; f₂ is a positive integer;

a word frequency determining subunit, used for counting the number of segmented phrases included in the i-th phrase set, and determining the ratio of that number to the sum of the numbers of segmented phrases included in the f₂ phrase sets as the phrase frequency corresponding to the segmented phrases included in the i-th phrase set.
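The phrase frequency computation described above can be sketched as follows. This is a minimal illustration only, assuming the text has already been segmented into a list of phrases; the function name `phrase_frequencies` is hypothetical and does not come from the application.

```python
from collections import Counter


def phrase_frequencies(phrases):
    """Return the phrase frequency of every distinct segmented phrase.

    Identical phrases are grouped into phrase sets (Counter buckets); the
    frequency of a set is its size divided by the total number of phrases.
    """
    counts = Counter(phrases)      # one bucket per phrase set (f2 buckets)
    total = sum(counts.values())   # sum over all phrase sets (equals f1)
    if total == 0:
        return {}
    return {phrase: n / total for phrase, n in counts.items()}


tf = phrase_frequencies(["cat", "dog", "cat", "bird"])
```

The frequencies of all phrase sets sum to 1, because the denominator is the sum of the sizes of all f₂ phrase sets.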

Wherein, the inverse frequency acquisition subunit includes:

the sample word segmentation subunit is used for acquiring at least two sample text data included in the corpus, and performing word segmentation processing on the at least two sample text data respectively to obtain sample word segmentation phrases corresponding to the at least two sample text data respectively;

the correlation statistics subunit is used for determining the number of sample text data correlated with the sample word segmentation word group as the number of correlated texts of the sample word segmentation word group;

and the inverse frequency determining subunit is used for acquiring the total number of sample texts of the at least two sample text data, and determining the inverse document frequency of the sample segmented phrase according to the total number of sample texts and the number of associated texts of the sample segmented phrase.
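A possible sketch of the inverse document frequency computation is given below. The passage above only states that the value is determined from the total number of sample texts and the number of associated texts; the smoothed logarithmic form log(total / (1 + associated)) used here is the common textbook choice and is an assumption, as is the function name `inverse_document_frequencies`.

```python
import math


def inverse_document_frequencies(sample_texts):
    """sample_texts: list of already segmented sample texts (lists of phrases)."""
    total = len(sample_texts)                 # total number of sample texts
    associated = {}                           # phrase -> number of associated texts
    for phrases in sample_texts:
        for phrase in set(phrases):           # count each text once per phrase
            associated[phrase] = associated.get(phrase, 0) + 1
    # Assumed formula: log(total / (1 + number of associated texts)).
    return {p: math.log(total / (1 + n)) for p, n in associated.items()}


corpus = [["the", "cat"], ["the", "dog"], ["the", "bird"], ["the", "cat", "sat"]]
idf = inverse_document_frequencies(corpus)
```

With this form, a phrase that occurs in every sample text (such as "the" here) receives a low, even negative, value, while a rare phrase receives a high one.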

Wherein, the device still includes:

an invalid phrase division module, used for determining a segmented phrase whose inverse document frequency is less than a phrase validity threshold as an invalid segmented phrase, and marking the segmented phrases other than the invalid segmented phrases among the f₁ segmented phrases as valid segmented phrases;

the keyword selection subunit is specifically configured to:

and determining text keywords corresponding to the text data from the effective word segmentation phrases based on the phrase importance.
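The filtering and selection steps above can be sketched as follows. The sketch assumes that phrase importance is the product of phrase frequency and inverse document frequency (the usual TF-IDF weighting, which the passage does not spell out), and that a phrase missing from the corpus statistics gets an inverse document frequency of 0. The names `select_keywords`, `valid_threshold` and `top_k` are hypothetical.

```python
def select_keywords(tf, idf, valid_threshold, top_k):
    """Drop invalid phrases, then keep the top_k phrases by importance."""
    scored = []
    for phrase, freq in tf.items():
        inv = idf.get(phrase, 0.0)        # assumption: unseen phrases get IDF 0
        if inv < valid_threshold:         # invalid segmented phrase, skip it
            continue
        scored.append((freq * inv, phrase))   # assumed importance: TF * IDF
    # Highest importance first; ties are broken alphabetically for determinism.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [phrase for _, phrase in scored[:top_k]]


tf = {"the": 0.4, "panda": 0.3, "bamboo": 0.2, "is": 0.1}
idf = {"the": 0.01, "panda": 2.0, "bamboo": 1.5, "is": 0.02}
keywords = select_keywords(tf, idf, valid_threshold=0.1, top_k=2)
```

Filtering on the inverse document frequency first removes very common words even when their phrase frequency in the page is high.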

Wherein, this text word segmentation subunit includes:

the word graph generating subunit is used for splitting the text data to obtain at least two characters forming the text data, and forming a directed word graph by the at least two characters; at least two characters are nodes of the directed word graph;

the path acquisition subunit is used for acquiring at least two character paths according to the directed word graph and the association degree between adjacent characters in the directed word graph;

the path screening subunit is used for acquiring path lengths corresponding to the at least two character paths respectively, and determining the shortest character path from the at least two character paths according to the path lengths;

a phrase generating subunit, used for combining the characters corresponding to the shortest character path into segmented phrases, to obtain the f₁ segmented phrases.
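One way to picture shortest-path segmentation over a directed word graph is the sketch below. It is not the method of the application: it substitutes a common dictionary-based variant in which graph nodes are the boundaries between characters, an edge exists where the spanned characters form a known word (a stand-in for the "association degree" between adjacent characters), and path length is the number of edges. The function name `segment_shortest_path` and the `vocabulary` parameter are hypothetical.

```python
def segment_shortest_path(text, vocabulary):
    """Split text into the fewest phrases using words from vocabulary."""
    n = len(text)
    # best[i] = (number of phrases, previous boundary) for the prefix text[:i]
    best = [(0, -1)] + [(float("inf"), -1)] * n
    for end in range(1, n + 1):
        for start in range(end):
            piece = text[start:end]
            # Single characters are always allowed, so every text is reachable.
            if end - start == 1 or piece in vocabulary:
                cost = best[start][0] + 1
                if cost < best[end][0]:
                    best[end] = (cost, start)
    # Walk back along the shortest path and collect the phrases.
    phrases, pos = [], n
    while pos > 0:
        start = best[pos][1]
        phrases.append(text[start:pos])
        pos = start
    return phrases[::-1]


words = segment_shortest_path("熊猫吃竹子", {"熊猫", "竹子"})
```

Choosing the path with the fewest edges favours longer known words over runs of single characters, which is the usual motivation for shortest-path segmentation.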

Wherein, this picture acquisition unit includes:

the first vector conversion subunit is used for performing vector conversion on the text keywords to obtain keyword vectors;

the tag matching subunit is used for acquiring at least two associated pictures of which the picture tags are associated with the text keywords from the candidate recommended gallery;

the second vector conversion subunit is used for respectively carrying out picture coding on the at least two associated pictures to obtain picture vectors respectively corresponding to the at least two associated pictures;

the similarity obtaining subunit is used for determining semantic similarity between the at least two associated pictures and the text keywords according to the vector distance between the picture vector and the keyword vector corresponding to the at least two associated pictures respectively;

and the picture selecting subunit is used for acquiring M candidate pictures from the at least two associated pictures based on the semantic similarity.
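The ranking step can be illustrated with the sketch below. It assumes the keyword vector and picture vectors already exist and have the same length, and it uses cosine similarity as the measure derived from the vector distance; the passage does not name a specific distance, so this choice, like the names `cosine_similarity` and `select_candidates`, is an assumption.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def select_candidates(keyword_vector, picture_vectors, m):
    """Return the ids of the m pictures most similar to the keyword vector."""
    ranked = sorted(
        picture_vectors.items(),
        key=lambda item: cosine_similarity(keyword_vector, item[1]),
        reverse=True,
    )
    return [picture_id for picture_id, _ in ranked[:m]]


pictures = {"panda.png": [0.9, 0.1], "car.png": [0.0, 1.0], "bamboo.png": [0.7, 0.3]}
top = select_candidates([1.0, 0.0], pictures, m=2)
```

Tag matching narrows the gallery to the associated pictures first, so the vector comparison only has to rank that smaller set.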

Wherein, the first vector conversion subunit is specifically configured to:

mapping the text keywords to a target semantic space to obtain a first vector of the text keywords;

performing dimensionality reduction on the first vector to obtain a keyword vector with the target vector length;

the second vector conversion subunit is specifically configured to:

mapping the at least two associated pictures to a target semantic space respectively to obtain second vectors corresponding to the at least two associated pictures respectively;

performing dimensionality reduction processing on the at least two second vectors to obtain picture vectors corresponding to the at least two associated pictures respectively; the length of the picture vector is the target vector length.

Wherein, this format display module includes:

the first quantity acquisition unit is used for responding to a picture selection operation on a target picture among the M candidate pictures, acquiring a layout selection quantity and a layout recommendation proportion, and determining a first layout recommendation quantity and a second layout recommendation quantity according to the layout selection quantity and the layout recommendation proportion; the sum of the first layout recommendation quantity and the second layout recommendation quantity is the layout selection quantity;

the template obtaining unit is used for obtaining first layout templates from the cold-start layout pool based on the first layout recommendation quantity and obtaining second layout templates from the hot-start layout pool based on the second layout recommendation quantity; the recommended weight value of a second layout template is greater than that of a first layout template;

the layout generation unit is used for combining the first layout templates and the second layout templates with the target picture respectively to generate N layout pages; N is the layout selection quantity;

and the layout display unit is used for displaying the N layout pages in the data recommendation area.
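The split between the two pools can be sketched as follows. How templates are picked inside each pool is not stated in this passage; the sketch assumes random sampling from the cold-start pool and highest-weight-first selection from the hot-start pool. The names `recommend_templates` and `cold_ratio` are hypothetical.

```python
import random


def recommend_templates(cold_pool, hot_pool, n, cold_ratio, rng):
    """cold_pool / hot_pool: dicts mapping template id -> recommended weight value."""
    # First and second layout recommendation quantities; together they make up n
    # whenever both pools are large enough.
    first = min(int(n * cold_ratio), len(cold_pool))
    second = min(n - first, len(hot_pool))
    # Assumption: cold-start templates are explored at random.
    cold = rng.sample(sorted(cold_pool), first)
    # Assumption: hot-start templates are taken in descending weight order.
    hot = sorted(hot_pool, key=hot_pool.get, reverse=True)[:second]
    return cold + hot


cold_pool = {"c1": 0.1, "c2": 0.2}
hot_pool = {"h1": 0.9, "h2": 0.7, "h3": 0.8}
picked = recommend_templates(cold_pool, hot_pool, n=4, cold_ratio=0.25,
                             rng=random.Random(0))
```

Reserving part of the N slots for cold-start templates gives new templates a chance to be shown and to accumulate selection statistics.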

Wherein, the device still includes:

the first time updating module is used for updating the target historical recommendation times and the target historical selection times of the target layout page based on page selection operation aiming at the target layout page;

and the weight updating module is used for updating the recommended weight value of the target layout template corresponding to the target layout page according to the target history recommending times and the target history selecting times.
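A minimal sketch of this bookkeeping is shown below. The formula for the recommended weight value is not given in this passage; the sketch assumes it is simply the selection rate (selections divided by recommendations) and that a selection also counts as one recommendation. `TemplateStats` and `on_page_selected` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class TemplateStats:
    recommended: int = 0   # historical recommendation count
    selected: int = 0      # historical selection count

    @property
    def weight(self):
        # Assumed weight: fraction of recommendations that led to a selection.
        return self.selected / self.recommended if self.recommended else 0.0


def on_page_selected(stats):
    """Update both counters when the user picks the corresponding layout page."""
    stats.recommended += 1   # assumption: the selection counts as a recommendation
    stats.selected += 1
    return stats.weight


stats = TemplateStats(recommended=9, selected=2)
new_weight = on_page_selected(stats)
```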

Wherein, the device still includes:

the first weight obtaining module is used for obtaining at least two cold-start layout templates included in the cold-start layout pool and a first recommended weight value corresponding to each cold-start layout template;

the storage updating module is used for adding the cold-start layout templates whose first recommended weight value is greater than or equal to the weight threshold into the hot-start layout pool;

the second weight obtaining module is used for obtaining at least two hot start layout templates included in the hot start layout pool and a second recommended weight value corresponding to each hot start layout template;

and the template removing module is used for removing the hot-start layout templates of which the second recommended weight values are smaller than the weight threshold from the hot-start layout pool.
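The pool maintenance described above can be sketched as follows. The passage does not say whether a promoted template is also removed from the cold-start pool or where a removed hot-start template goes, so the sketch leaves the cold-start pool untouched and simply drops removed templates; `rebalance_pools` is a hypothetical name.

```python
def rebalance_pools(cold_pool, hot_pool, weight_threshold):
    """Pools are dicts mapping template id -> recommended weight value."""
    # Cold-start templates whose weight reaches the threshold are promoted.
    promoted = {t: w for t, w in cold_pool.items() if w >= weight_threshold}
    # Hot-start templates whose weight falls below the threshold are removed.
    demoted = [t for t, w in hot_pool.items() if w < weight_threshold]
    for template_id in demoted:
        del hot_pool[template_id]
    hot_pool.update(promoted)
    return sorted(promoted), demoted


cold_pool = {"c1": 0.6, "c2": 0.1}
hot_pool = {"h1": 0.8, "h2": 0.2}
promoted, demoted = rebalance_pools(cold_pool, hot_pool, weight_threshold=0.5)
```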

Wherein, the device still includes:

the second time updating module is used for updating the first historical recommendation times of the first layout template if the time during which the layout page corresponding to the first layout template is in the page display state is greater than or equal to the recommended exposure duration threshold;

the utilization rate obtaining module is used for obtaining the layout utilization rate of the first layout template according to the updated first historical recommendation times and the first historical selection times of the first layout template;

and the template deleting module is used for deleting the first layout template in the cold-start layout pool if the layout utilization rate is smaller than the layout retention threshold.
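The exposure check and deletion rule can be sketched as follows, assuming the layout utilization rate is the ratio of historical selections to historical recommendations. The function name `review_cold_template` and the dictionary layout are hypothetical.

```python
def review_cold_template(cold_pool, template_id, display_seconds,
                         exposure_threshold, retention_threshold):
    """Return True if the template is still in the cold-start pool afterwards."""
    stats = cold_pool[template_id]
    if display_seconds < exposure_threshold:
        return True                      # too brief to count as a recommendation
    stats["recommended"] += 1            # updated historical recommendation count
    # Assumed utilization rate: selections / recommendations.
    usage = stats["selected"] / stats["recommended"]
    if usage < retention_threshold:
        del cold_pool[template_id]       # rarely chosen template is deleted
        return False
    return True


pool = {
    "t1": {"recommended": 9, "selected": 1},
    "t2": {"recommended": 9, "selected": 5},
}
kept_t1 = review_cold_template(pool, "t1", 6.0, 5.0, 0.2)
kept_t2 = review_cold_template(pool, "t2", 6.0, 5.0, 0.2)
```

Counting a recommendation only after a minimum exposure time keeps pages that were scrolled past from lowering a template's utilization rate.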

Wherein, this format display module includes:

the layout acquisition unit is used for responding to picture selection operation aiming at a target picture in the M candidate pictures and acquiring N layout templates;

the image adding unit is used for acquiring image display areas respectively included by the N layout templates, adding the target image to the image display areas in the N layout templates and generating N layout pages;

and the page display unit is used for displaying the N layout pages in the data recommendation area.

Wherein, this format display module includes:

a second number acquiring unit configured to acquire, in response to a picture selection operation for a target picture among the M candidate pictures, a number of display texts of text data displayed in a first display page of the media file and a number of display pictures of the file picture;

the target layout acquisition unit is used for acquiring a target layout pool corresponding to the number of the display texts and the number of the display pictures and acquiring N layout templates from the target layout pool;

the page generating unit is used for writing the target picture, the text data and the file picture into the N layout templates to generate N layout pages;

the page display unit is also used for displaying N layout pages in the data recommendation area.
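The lookup by element counts can be sketched as follows. The sketch assumes the layout pools are indexed by a pair (number of display texts, number of file pictures already on the page) and that every template in a pool has room for those elements plus the target picture. `build_layout_pages` and the dictionary shapes are hypothetical.

```python
def build_layout_pages(layout_pools, texts, file_pictures, target_picture, n):
    """layout_pools: dict mapping (text count, picture count) -> list of template ids."""
    key = (len(texts), len(file_pictures))     # selects the target layout pool
    templates = layout_pools.get(key, [])[:n]  # up to N layout templates
    # Write the target picture, the text data and the file pictures into each template.
    return [
        {
            "template": template_id,
            "texts": list(texts),
            "pictures": list(file_pictures) + [target_picture],
        }
        for template_id in templates
    ]


pools = {(2, 1): ["two_text_a", "two_text_b", "two_text_c"], (1, 0): ["single"]}
pages = build_layout_pages(pools, ["Title", "Body"], ["logo.png"], "panda.png", n=2)
```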

One aspect of the embodiments of the present application provides a computer device, including a processor, a memory, and an input/output interface;

the processor is respectively connected with the memory and the input/output interface, wherein the input/output interface is used for receiving data and outputting data, the memory is used for storing a computer program, and the processor is used for calling the computer program so as to enable the computer device comprising the processor to execute the file processing method in one aspect of the embodiment of the application.

An aspect of the embodiments of the present application provides a computer-readable storage medium, where a computer program is stored, where the computer program is adapted to be loaded and executed by a processor, so as to enable a computer device having the processor to execute the file processing method in the aspect of the embodiments of the present application.

An aspect of an embodiment of the present application provides a computer program product or a computer program, which includes computer instructions stored in a computer-readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions to cause the computer device to perform the method provided in the various alternatives in one aspect of the embodiments of the application.

The embodiment of the application has the following beneficial effects:

in the embodiment of the application, the user equipment can display text data in a first display page of a media file and, for the text data, display M candidate pictures in a data recommendation area, where M is a positive integer; in response to a picture selection operation on a target picture among the M candidate pictures, display N layout pages in the data recommendation area, where each layout page comprises the target picture and the text data displayed in the first display page of the media file, and N is a positive integer; and in response to a page selection operation on a target layout page among the N layout pages, switch the first display page to display the target layout page. Through this process, pictures related to the content of the media file are recommended for the user to choose from, so the user does not need to search for pictures, which reduces the cost of obtaining pictures while making the media file. Once the user selects the target picture, layout pages that include the target picture are provided; these pages are already typeset, so the user can directly select the required target layout page from the layout pages provided by the user equipment. The page is thus typeset directly, which reduces the typesetting cost of the media file, improves its generation efficiency, and improves the flexibility and display effect of media file generation.

Drawings

In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present application, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts.

FIG. 1 is a network interaction architecture diagram of file processing according to an embodiment of the present application;

FIG. 2 is a schematic diagram of a file processing scenario provided in an embodiment of the present application;

FIG. 3 is a flowchart of a file processing method according to an embodiment of the present application;

FIG. 4a is a schematic diagram of a picture recommendation scene based on a media file according to an embodiment of the present application;

FIG. 4b is a schematic diagram of a scene of a recommendation of a picture based on an element in a media file according to an embodiment of the present application;

FIG. 5 is a schematic diagram of a utilization rate display scenario provided in an embodiment of the present application;

FIG. 6 is a schematic view of a progress display scene provided in an embodiment of the present application;

FIG. 7 is a flowchart of a specific method of a file processing scenario according to an embodiment of the present application;

FIG. 8 is a schematic diagram of a text splitting scene according to an embodiment of the present application;

FIG. 9 is a schematic diagram of a distance acquisition scene according to an embodiment of the present application;

FIG. 10 is a schematic diagram of a text processing scenario provided in an embodiment of the present application;

FIG. 11 is a schematic view of a picture processing scene according to an embodiment of the present application;

FIG. 12 is a schematic diagram of a format recommendation scenario provided by an embodiment of the present application;

FIG. 13 is a flowchart of a format pool updating method provided in an embodiment of the present application;

FIG. 14 is a flowchart of a method for updating recommended weight values according to an embodiment of the present disclosure;

FIG. 15 is a diagram illustrating a layout dictionary tree according to an embodiment of the present application;

FIG. 16 is a diagram illustrating a structure of subject data provided by an embodiment of the present application;

FIG. 17 is a schematic view of a document processing apparatus according to an embodiment of the present application;

FIG. 18 is a schematic structural diagram of a computer device according to an embodiment of the present application.

Detailed Description

The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.

Refer to FIG. 1, which is a network interaction architecture diagram of file processing provided in an embodiment of the present application. The embodiment may be implemented by a user equipment. The media application is an application that can edit or view files and can manage media files in one or more media formats, such as document files (e.g., media files in doc or docx format), video files, or presentation files. A presentation file (PowerPoint, PPT) turns static file content into dynamically browsable slides and turns complex content into easily understood visual information, making it more vivid and more memorable; a presentation file consists of one or more slides. The media application may be an online media application or a conventional media application, which is not limited herein. An online media application is an application that multiple users can use collaboratively to share or edit the same media file online; a conventional media application may be an application with which a user edits media files locally.

For example, when d user equipments cooperate based on an online media application, data interaction may be performed among them: when any one of the d user equipments changes an online media file (for example, adds, deletes, or modifies content), the changed data is shared with the other user equipments, where d is a positive integer. The d user equipments may interact through a server, directly, or through a combination of the server and direct communication. As shown in FIG. 1, the present application may be implemented by any user equipment (e.g., user equipment 102a, user equipment 102b, or user equipment 102c), and the user equipment may obtain data from a computer device 101, or obtain data based on a cloud storage technology or a blockchain technology. That is, the data referred to in this application (such as candidate pictures and layout pages) may be stored in the computer device 101, in a cloud space based on a cloud storage technology, or in a blockchain network based on a blockchain technology, which is not limited herein. Optionally, if the media application is an online media application, a user equipment that updates the media file may synchronize the updated media file to the other user equipments. For example, after updating the media file, user equipment 102a may synchronize the updated media file to the other user equipments (such as user equipment 102b and user equipment 102c), where the user equipments that receive the media file synchronized by user equipment 102a are those that cooperate with user equipment 102a and can edit the media file.

Specifically, refer to FIG. 2, which is a schematic diagram of a file processing scenario provided in an embodiment of the present application. As shown in FIG. 2, the user equipment 201 may respond to a picture recommendation request for a media file 202 and, based on that request, display M candidate pictures 203 in a data recommendation area 2021, where M is a positive integer. The M candidate pictures 203 are obtained based on the media file 202 and match its semantics, so pictures strongly related to the content of the media file can be obtained without the user searching for them, which saves picture retrieval cost (including time and resources). Further, in response to a picture selection operation on a target picture 204 among the M candidate pictures 203, the user equipment 201 may display N layout pages 205 in the data recommendation area 2021, where each layout page includes the target picture 204 and the text data displayed in the first display page of the media file, and N is a positive integer. In this way the user is offered N layout pages, can pick the one needed, and can use it directly without typesetting, so the content layout of the first display page is in effect generated with one click. When the user equipment 201 responds to a page selection operation on a target layout page 206 among the N layout pages 205, the first display page is switched to display the target layout page 206. This saves the time and resources otherwise spent searching for pictures and typesetting after insertion, reduces the generation cost of the media file, improves its generation efficiency, and improves the flexibility and display effect of media file generation.

When the media application program is an online media application, if d user equipment simultaneously edits a media file, a situation that a plurality of user equipment insert pictures in a first display page may occur, according to the scheme in the application, the d user equipment can display M candidate pictures associated with text data displayed in the media file, so that the selection ranges of the pictures required to be inserted in the first display page by the d user equipment are the same, time consumed by each user equipment for respectively searching the pictures is reduced, the pictures searched by different user equipment have large difference, changing opinions of the first display page are different, picture insertion and page composition cannot be performed on the first display page in real time, and efficiency of the first display page is reduced. By recommending the candidate pictures in the same range for each user equipment, the selection ranges of the user equipment are consistent, the possibility that the user equipment selects different pictures is reduced, and the editing efficiency of the media file is improved. Optionally, if each user equipment selects a target picture from the M candidate pictures, the server may take the target picture with the minimum trigger time of the corresponding picture selection operation as a picture to be inserted into the first display page based on the time sequence of receiving the picture selection operation corresponding to each user equipment, and generate N layout pages according to the target picture with the minimum trigger time, where each user equipment may display the N layout pages in the data recommendation area. 
For example, assume that d is 3, the trigger time of the picture selection operation of user equipment 1 on target picture 1 is trigger time 1, the trigger time of the picture selection operation of user equipment 2 on target picture 2 is trigger time 2, and the trigger time of the picture selection operation of user equipment 3 on target picture 3 is trigger time 3. If trigger time 1 is less than trigger time 3 and trigger time 3 is less than trigger time 2, N layout pages are generated according to target picture 1, which corresponds to trigger time 1. Alternatively, when a picture selection operation for a target picture among the M candidate pictures is received, the target pictures selected by the d user equipments within an operation receiving period may be acquired (some user equipments may not select a target picture within the period), and N layout pages may be generated according to the target picture selected the largest number of times, which is not limited herein. The operation receiving period avoids the situation in which the picture insertion process for the media file cannot continue because some user equipment does not insert a picture into the first display page.
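The two selection strategies described above (earliest trigger time, or most selections within the operation receiving period) may be sketched as follows. This is only an illustrative sketch: the `Selection` record, the function names, and the tie-break rule are assumptions and are not specified by the application.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Selection:
    """One picture selection operation (hypothetical record type)."""
    device_id: str
    picture_id: str
    trigger_time: float  # any comparable timestamp


def pick_earliest(selections):
    """Return the picture whose selection operation has the minimum trigger time."""
    return min(selections, key=lambda s: s.trigger_time).picture_id


def pick_most_selected(selections):
    """Return the picture selected most often within the receiving period.

    Ties are broken by the earliest trigger time; this tie-break is an
    assumption, since the text does not define one.
    """
    counts = Counter(s.picture_id for s in selections)
    earliest = {}
    for s in selections:
        previous = earliest.get(s.picture_id, s.trigger_time)
        earliest[s.picture_id] = min(previous, s.trigger_time)
    return max(counts, key=lambda p: (counts[p], -earliest[p]))


# d = 3, with trigger time 1 < trigger time 3 < trigger time 2.
selections = [
    Selection("device-1", "picture-1", trigger_time=1.0),
    Selection("device-2", "picture-2", trigger_time=3.0),
    Selection("device-3", "picture-3", trigger_time=2.0),
]
print(pick_earliest(selections))  # picture-1
```

In the earliest-time strategy the server needs no waiting window, whereas the majority strategy only makes sense once the operation receiving period has elapsed.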

It is understood that the user equipment (e.g., the first user equipment or the second user equipment, etc.) mentioned in this embodiment of the present application may be a computer device, and the computer device in this embodiment of the present application includes, but is not limited to, a terminal device or a server. In other words, the computer device may be a server or a terminal device, or may be a system of a server and a terminal device. The above-mentioned terminal device may be an electronic device, including but not limited to a mobile phone, a tablet computer, a desktop computer, a notebook computer, a palm computer, a vehicle-mounted device, an Augmented Reality/Virtual Reality (AR/VR) device, a helmet display, a smart television, a wearable device, a smart speaker, a digital camera, a camera, and other Mobile Internet Devices (MID) with network access capability, or a terminal device in a scene such as a train, a ship, or a flight, and the like. The above-mentioned server may be an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as cloud service, a cloud database, cloud computing, a cloud function, cloud storage, a Network service, cloud communication, middleware service, domain name service, security service, a vehicle, a Content Delivery Network (CDN), big data, an artificial intelligence platform, and the like.

Optionally, the data related to the embodiment of the present application may be stored in a computer device, or the data related to the present application may be stored based on a cloud storage technology or a block chain technology, which is not limited herein.

Further, refer to fig. 3, which is a flowchart of a file processing method provided in an embodiment of the present application. As shown in fig. 3, the file processing procedure includes the following steps:

Step S301, displaying text data in a first display page of the media file, and displaying M candidate pictures in a data recommendation area according to the text data.

In the embodiment of the application, the user equipment may display text data in a first display page of a media file, acquire M candidate pictures associated with the text data in the media file, and display the M candidate pictures in a data recommendation area, where M is a positive integer. Optionally, the media file mentioned in the present application may be a document file, a video file, a presentation file, or the like. The media file may be an online media file or a local media file, where an online media file is a media file that can be edited in cooperation with other users, and a local media file is a media file that is edited by the user alone and needs to be sent to other users before they can edit it. Optionally, the user equipment may display a picture recommendation component associated with the media file and, in response to a trigger operation for the picture recommendation component, obtain M candidate pictures associated with the media file and display the M candidate pictures in the data recommendation area. This picture recommendation component may be denoted as a first picture recommendation component, that is, a picture recommendation component associated with the entire media file. Optionally, the user equipment may obtain the first display page of the media file and obtain M candidate pictures associated with the first display page, where the first display page is the page of the media file that is currently being displayed, that is, the page whose content the user can see directly.

Specifically, the user equipment may generate a picture recommendation request for a media file in a media application based on a triggering operation of a picture recommendation component in the media file. The picture recommendation component can be located at any position in the media application program, and the picture recommendation component can have different triggering modes.

For example, refer to fig. 4a and fig. 4b, which are schematic diagrams of picture recommendation scenes provided in an embodiment of the present application. As shown in fig. 4a and fig. 4b, the media application 401 may include, but is not limited to, application areas such as an editing component area 402, a thumbnail display area 403, and a file display area 404, and the relative positions of the application areas displayed in the media application 401 are not limited. When a media file is managed in the media application 401, the media file may be considered to be associated with the application areas, and data related to the media file may be displayed through the application areas. Fig. 4a is a schematic diagram of a picture recommendation scene based on a media file according to an embodiment of the present application. As shown in fig. 4a, the user equipment may display a picture recommendation component in the editing component area 402 of the media application 401; alternatively, the picture recommendation component may be displayed in the application function component list associated with any one of the file editing components (e.g., a file component, an editing component, etc.) displayed in the editing component area 402. Taking fig. 4a as an example, the user equipment may display M candidate pictures in the data recommendation area 405 in response to a trigger operation for the picture recommendation component 4011.
In response to the trigger operation for the picture recommendation component 4011, the user equipment may acquire the text data in the first display page, acquire M candidate pictures associated with the text data, and display the M candidate pictures in the data recommendation area 405; alternatively, it may acquire target text data from the text data displayed on the first display page, acquire M candidate pictures associated with the target text data, and display them in the data recommendation area 405. That is, the user equipment may recommend candidate pictures for the text data of the entire first display page, or for part of the text data in the first display page, namely the target text data (one or at least two items of the text data displayed in the first display page). The user equipment may always display the data recommendation area 405 in the media application 401, or may display it only when candidate pictures need to be displayed, which is not limited herein. Optionally, the data recommendation area 405 may be displayed in the same page as the first display page, or may be displayed independently of the first display page, which is not limited herein.

For example, the user equipment may make a picture recommendation for an element in the first display page. Specifically, the user equipment may display target text data in the media file in response to an input operation for the target text data, and display M candidate pictures associated with the target text data in the data recommendation area in response to a picture recommendation operation for the target text data. The input operation and the picture recommendation operation may be triggered adjacently, that is, the picture recommendation operation is triggered immediately after the input operation; or they may be triggered non-adjacently, that is, other operations may be performed after the input operation before the picture recommendation operation is triggered. Specifically, the user equipment may display the target text data in the first display page of the media file and display M candidate pictures associated with the target text data in the data recommendation area; that is, displaying the target text data in the first display page may itself trigger the picture recommendation operation for the target text data, whereupon the M candidate pictures are acquired. Alternatively, the user equipment may display the target text data in the first display page of the media file and, when a trigger operation for a picture recommendation component (such as the picture recommendation component shown in fig. 4a) is received, display M candidate pictures associated with the target text data in the data recommendation area. Alternatively, the user equipment may display the target text data in the first display page of the media file, display a picture recommendation component in response to a trigger operation for the target text data, and display M candidate pictures associated with the target text data in the data recommendation area in response to a trigger operation for that picture recommendation component.

For example, refer to fig. 4b, which is a schematic diagram of a picture recommendation scene based on an element in a media file according to an embodiment of the present application. As shown in fig. 4b, the user equipment displays target text data in the media file in response to an input operation for the target text data. Optionally, a target text box 4041 may be displayed in the file display area 404; in response to an input operation for the target text box 4041, the user equipment obtains the target text data "care for the elderly" input in the target text box 4041 and displays it in the media file. Further, the user equipment displays the picture recommendation component 4042 in response to a trigger operation for the target text data "care for the elderly". In response to a trigger operation for the picture recommendation component 4042, M candidate pictures 406 associated with the target text data "care for the elderly" are displayed in the data recommendation area 405.

Optionally, the user equipment may display, in response to the viewing operation for the target picture in the M candidate pictures, a historical utilization rate of the target picture, where the historical utilization rate is used to indicate a probability that the target picture is selected, that is, a probability that the target picture is selected by the user after being recommended to the user. The user equipment can respond to the viewing operation of any one candidate picture in the M candidate pictures and display the historical utilization rate of the candidate picture corresponding to the viewing operation.

For example, refer to fig. 5, which is a schematic diagram of a utilization rate display scene provided in an embodiment of the present application. As shown in fig. 5, the user equipment 501 may display M candidate pictures 502 in the data recommendation area and, in response to a viewing operation for a candidate picture 5021 among the M candidate pictures 502, display the historical utilization rate of the candidate picture 5021. Assuming that the historical utilization rate of the candidate picture 5021 is 63%, the user equipment 501 may display the historical utilization rate in the data recommendation area, or in a utilization rate display area that is displayed independently of the data recommendation area, which is not limited herein. For example, in fig. 5, in the first display mode, the historical utilization rate "63%" of the candidate picture 5021 is displayed in the utilization rate display area 503, which is associated with the candidate picture 5021; in the second display mode, the historical utilization rate "63%" of the candidate picture 5021 is displayed in the data recommendation area, as shown by the area 504, which is associated with the candidate picture 5021 and which the user equipment 501 may display, for example, adjacent to the area in which the candidate picture 5021 is displayed.
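As an illustrative sketch only, the historical utilization rate may be estimated as the share of past recommendations in which the picture was selected. The counters `times_selected` and `times_recommended` and both function names are assumptions; the application only states that the rate indicates the probability that a recommended picture is selected.

```python
def historical_utilization_rate(times_selected, times_recommended):
    """Estimate the probability that a recommended picture is selected.

    Both counters are hypothetical inputs; a picture that was never
    recommended is given a rate of 0.0 in this sketch.
    """
    if times_recommended <= 0:
        return 0.0
    return times_selected / times_recommended


def format_rate(rate):
    """Format a rate in [0, 1] as a whole-number percentage label."""
    return f"{round(rate * 100)}%"


print(format_rate(historical_utilization_rate(63, 100)))  # 63%
```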

Further, in response to a picture recommendation request for the media file, the user equipment may acquire, based on the picture recommendation request, the text data displayed in the first display page of the media file; extract keywords from the text data to obtain the text keywords corresponding to the text data; acquire M candidate pictures matching the text keywords from a candidate recommendation gallery; and display the M candidate pictures in the data recommendation area.

Step S302, responding to picture selection operation aiming at a target picture in the M candidate pictures, and displaying N layout pages in the data recommendation area.

In the embodiment of the application, in response to a picture selection operation for a target picture among the M candidate pictures, the user equipment obtains N layout templates associated with the target picture, writes the target picture and the text data displayed in the first display page into the N layout templates to generate N layout pages, and displays the N layout pages in the data recommendation area. Optionally, the user equipment may obtain the number of display texts of the text data displayed in the first display page of the media file, the number of display pictures of the file pictures, and the like, and obtain the N layout templates based on the number of display texts and the number of display pictures; or the user equipment may obtain the N layout templates from a hot-start layout pool, a cold-start layout pool, and the like based on a layout recommendation ratio, which is not limited herein.
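The template selection by number of display texts and number of display pictures may be sketched as follows. The `LayoutTemplate` type, its slot fields, and the exact-match rule are assumptions made for illustration; the application does not define the template format or the matching criterion.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LayoutTemplate:
    """Hypothetical layout template with a fixed number of slots."""
    name: str
    text_slots: int
    picture_slots: int


def select_templates(templates, text_count, picture_count, n):
    """Return up to n templates whose slots match the page content exactly."""
    matching = [
        t for t in templates
        if t.text_slots == text_count and t.picture_slots == picture_count
    ]
    return matching[:n]


def fill_template(template, texts, pictures):
    """Write the text data and pictures into a template to form a layout page."""
    return {
        "template": template.name,
        "texts": list(texts),
        "pictures": list(pictures),
    }


templates = [
    LayoutTemplate("title-left", text_slots=2, picture_slots=1),
    LayoutTemplate("single-text", text_slots=1, picture_slots=1),
    LayoutTemplate("title-top", text_slots=2, picture_slots=1),
]
# Two text items on the first display page plus the selected target picture.
chosen = select_templates(templates, text_count=2, picture_count=1, n=2)
pages = [fill_template(t, ["Title", "Body"], ["target.png"]) for t in chosen]
```

Here `picture_count` is assumed to count the target picture together with any pictures already on the page.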

Optionally, when the user equipment generates N layout pages, a generation progress message of each layout page may be displayed. Specifically, the user equipment responds to picture selection operation aiming at a target picture in a media file and displays a generation progress message of each layout page in a data recommendation area; if the generation of the layout page fails (the layout page with the generation failure can be recorded as an abnormal layout page), a generation failure prompt message of the abnormal layout page can be displayed in the data recommendation area, or the abnormal layout page can be directly deleted; if the layout page is successfully generated (the successfully generated layout page can be recorded as a normal layout page), the user equipment can display the normal layout page in the data recommendation area, and optionally, the user equipment can also display a message for prompting the successful generation of the normal layout page.

Specifically, refer to fig. 6, which is a schematic diagram of a progress display scene provided in an embodiment of the present application. As shown in fig. 6, the user equipment 601 may obtain N layout templates in response to a picture selection operation for a target picture 6021 among the M candidate pictures 602, and generate N layout pages based on the N layout templates, the target picture, and the first display page. Optionally, the user equipment 601 may display the generation progress message of each layout page in the data recommendation area 603, for example, a generation progress message 6031 ("4%") of layout page 1, a generation progress message 6032 ("5%") of layout page 2, and a generation progress message 6033 ("3%") of layout page 3. Further, the user equipment 601 may display the generation result of each layout page. For example, if layout page 1 is generated successfully, the user equipment 601 displays the generated layout page 1 (i.e., the content indicated by the area 6041) in the data recommendation area 603; if layout page 2 is generated successfully, the user equipment 601 displays the generated layout page 2 (i.e., the content indicated by the area 6042) in the data recommendation area 603; and if layout page 3 fails to be generated, the user equipment 601 displays a generation failure prompt message 6043 of layout page 3 in the data recommendation area 603.
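The handling of normal and abnormal layout pages may be sketched as follows. This is a simplified illustration: `render` and `notify` are hypothetical callbacks supplied by the caller, and status messages stand in for the percentage progress messages shown in fig. 6.

```python
def generate_layout_pages(template_names, render, notify):
    """Generate one layout page per template and report each outcome.

    render(name) builds a page or raises an exception on failure;
    notify(name, message) displays a message for that layout page.
    Both callbacks are assumptions made for this sketch.
    """
    normal_pages = {}
    abnormal_pages = []
    for name in template_names:
        notify(name, "generating")
        try:
            normal_pages[name] = render(name)
        except Exception as exc:
            # Abnormal layout page: show a failure prompt instead of a page.
            abnormal_pages.append(name)
            notify(name, f"generation failed: {exc}")
        else:
            notify(name, "generated")
    return normal_pages, abnormal_pages
```

A caller that prefers to delete abnormal layout pages silently can simply ignore the second return value.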

Step S303, responding to page selection operation aiming at a target layout page in the N layout pages, and switching and displaying the first display page as the target layout page.

Furthermore, the user equipment can respond to page selection operation aiming at a target layout page in the N layout pages, switch and display the first display page as the target layout page, realize picture insertion and page typesetting of the first display page, and improve the generation efficiency of the page. Further optionally, the layout page may further include a text box to be edited, that is, the target layout page includes a target text box to be edited. The user equipment can display the content prompt text in the target text box to be edited of the target layout page, respond to the input operation aiming at the target text box to be edited, and switch and display the content prompt text in the target text box to be edited into the text content corresponding to the input operation. For example, the content prompt text displayed in the target text box to be edited is "speaker information", and in response to the input operation for the target text box to be edited, the content prompt text "speaker information" in the target text box to be edited is switched and displayed to the text content corresponding to the input operation, such as "speaker 1".
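The behaviour of the text box to be edited may be sketched as follows; the `EditableTextBox` class and its method names are hypothetical and serve only to illustrate how the content prompt text gives way to the entered text.

```python
from dataclasses import dataclass


@dataclass
class EditableTextBox:
    """Hypothetical text box that shows a prompt until text is entered."""
    prompt_text: str
    content: str = ""

    def input(self, text):
        """Record the text content corresponding to an input operation."""
        self.content = text

    def displayed_text(self):
        """Return the entered content, or the content prompt text if empty."""
        return self.content if self.content else self.prompt_text


box = EditableTextBox(prompt_text="speaker information")
before = box.displayed_text()  # "speaker information"
box.input("speaker 1")
after = box.displayed_text()   # "speaker 1"
```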

In the embodiment of the application, the user equipment can respond to a picture recommendation request for the media file and display M candidate pictures in the data recommendation area, where M is a positive integer; display N layout pages in the data recommendation area in response to a picture selection operation for a target picture among the M candidate pictures, where each layout page includes the target picture and the text data displayed in the first display page of the media file, and N is a positive integer; and, in response to a page selection operation for a target layout page among the N layout pages, switch the first display page to display the target layout page. Through this process, pictures related to the content of the media file can be recommended for the user to select, and the user does not need to search for pictures, which reduces the cost of obtaining pictures while the user makes the media file. When the user selects the target picture to be used, layout pages that include the target picture and are already typeset are provided to the user, so the user can directly select the required target layout page from the layout pages provided by the user equipment. The page is thereby typeset directly, which reduces the typesetting cost of the media file, further improves the generation efficiency of the media file, and improves the flexibility and display effect of generating the media file.

Further, referring to fig. 7, fig. 7 is a flowchart of a specific method of a file processing scenario provided in an embodiment of the present application. As shown in fig. 7, the method includes the steps of:

Step S701, obtaining text data in the media file.

In this embodiment, the user equipment may display text data in a first display page of a media file and obtain the text data in the media file. Optionally, the user equipment may obtain the text data displayed in the first display page of the media file, as shown in fig. 4a; or the user equipment may obtain the target text data corresponding to the picture recommendation request, as shown in fig. 4b, which is not limited herein. Specifically, the way candidate pictures are recommended depends on the object for which candidate pictures are to be recommended. For example, the user equipment may respond to a picture recommendation request for a media file and obtain the text data displayed in the first display page of the media file; or, when the target text data is displayed in the first display page of the media file, the user equipment obtains the target text data and executes step S702; or, when the target text data is displayed in the first display page of the media file and a trigger operation for the picture recommendation component is received, the user equipment obtains the target text data and executes step S702, where the target text data may optionally be in a selected state; or, when the target text data is displayed in the first display page of the media file, the user equipment may display the picture recommendation component in response to a trigger operation for the target text data, obtain the target text data in response to a trigger operation for the picture recommendation component, and execute step S702. For details, refer to the description of step S301 in fig. 3.

Step S702, extracting text keywords in the text data.

In the embodiment of the application, the user equipment may perform word segmentation processing on the text data to obtain f₁ word-segmentation phrases, where f₁ is a positive integer, and obtain the text keywords from the f₁ word-segmentation phrases. Optionally, the user equipment may delete the meaningless word-segmentation phrases from the f₁ word-segmentation phrases to obtain the text keywords. A dictionary of word-segmentation phrases without practical meaning may be established in advance; the dictionary may include one or at least two meaningless word-segmentation phrases, and the user equipment may delete those of the f₁ word-segmentation phrases that belong to the dictionary. A meaningless word-segmentation phrase is a phrase that has no actual meaning or has no actual influence on the semantics of a sentence, such as an auxiliary word. Alternatively, the user equipment may obtain the phrase importance corresponding to each of the f₁ word-segmentation phrases and determine the text keywords based on the phrase importance. Specifically, the user equipment may perform word segmentation processing on the text data to obtain the f₁ word-segmentation phrases; obtain the phrase frequency and the inverse document frequency corresponding to each of the f₁ word-segmentation phrases; determine the phrase importance corresponding to each of the f₁ word-segmentation phrases according to its phrase frequency and inverse document frequency; and determine, based on the phrase importance, the text keywords corresponding to the text data from the f₁ word-segmentation phrases.
The user equipment may perform word segmentation processing on the text data by using a text word segmentation algorithm, which includes, but is not limited to, the algorithm of a text word segmentation tool or a shortest-path word segmentation algorithm. For example, the user equipment may split the text data to obtain at least two characters constituting the text data, and compose the at least two characters into a directed word graph in which the at least two characters are the nodes. The user equipment may then obtain at least two character paths according to the directed word graph and the association degree between adjacent characters in the directed word graph; acquire the path length corresponding to each of the at least two character paths, and determine the shortest character path from the at least two character paths according to the path lengths; and compose the characters corresponding to the shortest character path into word-segmentation phrases to obtain the f₁ word-segmentation phrases.

For example, refer to fig. 8, which is a schematic diagram of a text splitting scene according to an embodiment of the present application. As shown in fig. 8, the user equipment may split the text data into at least two characters, for example, split the text data "what he said really is" into characters such as "he", "say", "certain", "true", "at", and "reason", and compose the at least two characters into a directed word graph, which may be a directed acyclic graph. The weights of the edges in the directed word graph may be considered equal; for example, the weight of each edge may be recorded as 1 or another default weight value. When the shortest path of a directed acyclic graph is obtained, the shortest path between two vertices also contains the shortest paths between the other vertices on that path. For example, assume that S → A → B → E is the shortest path from S to E; then S → A → B must be the shortest path from S to B. Otherwise there would be a vertex C such that d(S → C → B) is less than d(S → A → B), and the shortest path from S to E would become S → C → B → E, which contradicts the assumption. The shortest path in the directed word graph can therefore be obtained by using this optimal substructure property. Here d(*) denotes the path length of the path "*", which can be derived from the weights of the edges on that path. Optionally, the user equipment may obtain at least two character paths according to the directed word graph and the association degree between adjacent characters in the directed word graph, for example, "true, ideal, and the like" in fig. 8; adopt a path analysis algorithm to obtain the path length corresponding to each character path; determine the shortest path from the at least two character paths based on the path lengths; and compose the characters corresponding to the shortest path into word-segmentation phrases such as "true, ideal, and the like". Optionally, the path analysis algorithm may be a greedy algorithm or another algorithm capable of analyzing paths, which is not limited herein. For example, when the user equipment performs word segmentation processing on the text data "nursing of the elderly", it finds that "the elderly" forms one shortest path and "nursing" forms another, and thus obtains f₁ word-segmentation phrases such as "the elderly", "nursing", and the like. Further, the user equipment may delete the meaningless word-segmentation phrase "of" from the f₁ word-segmentation phrases to obtain the text keywords "the elderly" and "nursing" corresponding to the text data.
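A minimal sketch of shortest-path word segmentation is given below, relying on the optimal substructure property of the directed acyclic word graph. Several points are assumptions made for illustration: a small `vocabulary` set stands in for the association degree between adjacent characters, every edge has weight 1, the path analysis is a dynamic-programming pass rather than a greedy algorithm, and Latin letters stand in for the characters of the text data.

```python
def shortest_path_segment(text, vocabulary, max_word_len=8):
    """Segment text into the fewest phrases using a directed word graph.

    Nodes are the positions 0..len(text). An edge start -> end exists when
    text[start:end] is a single character or is in the vocabulary. All
    edges have weight 1, so the shortest path has the fewest phrases.
    """
    n = len(text)
    infinity = float("inf")
    dist = [0] + [infinity] * n  # dist[i]: shortest path length from 0 to i
    prev = [None] * (n + 1)      # prev[i]: predecessor of i on that path

    # Positions are visited left to right, which is a topological order.
    for start in range(n):
        for end in range(start + 1, min(n, start + max_word_len) + 1):
            piece = text[start:end]
            if end - start == 1 or piece in vocabulary:
                if dist[start] + 1 < dist[end]:
                    dist[end] = dist[start] + 1
                    prev[end] = start

    # Walk back from the final position to recover the phrases.
    phrases = []
    end = n
    while end > 0:
        start = prev[end]
        phrases.append(text[start:end])
        end = start
    return phrases[::-1]


vocabulary = {"elder", "elderly", "car", "care"}
print(shortest_path_segment("elderlycare", vocabulary))  # ['elderly', 'care']
```

Because single characters are always allowed as edges, every position is reachable and the function always returns a segmentation, even for text not covered by the vocabulary.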

Further optionally, when obtaining the phrase frequency corresponding to each of the f₁ word-segmentation phrases, the user equipment may divide identical word-segmentation phrases among the f₁ word-segmentation phrases into f₂ phrase sets, where the word-segmentation phrases included in each phrase set are the same and f₂ is a positive integer. For example, if the f₁ word-segmentation phrases include "the elderly", "nursing", "nursing", "diet", and "recuperation", they may be divided into f₂ phrase sets such as "the elderly", "nursing, nursing", "diet", and "recuperation". Optionally, the phrase sets are obtained only to make it convenient to count the occurrences of each word-segmentation phrase; the f₁ word-segmentation phrases may also be left undivided and the number of occurrences of each word-segmentation phrase counted directly, which is not limited herein. Further, the number of word-segmentation phrases included in the i-th phrase set is counted, and the ratio of the phrase number corresponding to the i-th phrase set to the sum of the phrase numbers of the f₂ phrase sets is determined as the phrase frequency corresponding to the word-segmentation phrase included in the i-th phrase set. That is, the term frequency (TF) indicates how often the corresponding word-segmentation phrase appears in the text data, and may generally be expressed as TF = (number of occurrences of the word-segmentation phrase in the text data) / (total number of word-segmentation phrases in the text data). The importance of a word-segmentation phrase generally increases in proportion to the number of times it appears in the text data, but decreases in inverse proportion to the frequency with which it appears in the corpus.
For example, for f₁ word-segmentation phrases "the elderly", "of", and "nursing", the TF of each word-segmentation phrase is 1/3. Obviously, however, recommending pictures based on "of" has no practical value, so the abnormal influence of "of" is eliminated by means of the inverse document frequency (IDF).
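The phrase frequency computation may be sketched as follows; the function name `term_frequencies` is hypothetical, and `collections.Counter` plays the role of the f₂ phrase sets by grouping identical word-segmentation phrases.

```python
from collections import Counter


def term_frequencies(phrases):
    """Return TF for each distinct phrase.

    TF = occurrences of the phrase / total number of phrases in the text.
    """
    counts = Counter(phrases)  # one entry per phrase set
    total = len(phrases)
    return {phrase: count / total for phrase, count in counts.items()}


tf_repeated = term_frequencies(
    ["the elderly", "nursing", "nursing", "diet", "recuperation"]
)
print(tf_repeated["nursing"])  # 0.4
```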

Specifically, when obtaining the inverse document frequency corresponding to each of the f₁ word-segmentation phrases, the user equipment may obtain at least two sample text data included in a corpus, and perform word segmentation processing on each of the at least two sample text data to obtain the sample word-segmentation phrases corresponding to each sample text data; determine the number of sample text data associated with a sample word-segmentation phrase as the associated text number of that sample word-segmentation phrase; and obtain the total number of sample texts of the at least two sample text data, and determine the inverse document frequency of the sample word-segmentation phrase according to the total number of sample texts and the associated text number of the sample word-segmentation phrase. The IDF is an index expressing the importance of a word. The user equipment may use "IDF = log(total number of sample texts in the corpus / (associated text number of the word-segmentation phrase + 1))". If all sample text data include a word-segmentation phrase, the IDF of that phrase may be considered to be log(1) = 0, that is, the importance of the phrase is 0. The "+1" in the denominator is optional and prevents the denominator from being 0. Alternatively, the above formula may be used when the associated text number is 0, and "IDF = log(total number of sample texts in the corpus / associated text number of the word-segmentation phrase)" may be used when the associated text number is not 0, which is not limited herein.
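Both IDF variants may be sketched as follows. The function name, the `smooth` flag, and the toy corpus are assumptions; the natural logarithm is used because the application does not state the base.

```python
import math


def inverse_document_frequency(phrase, segmented_samples, smooth=True):
    """Compute the IDF of a phrase over already-segmented sample texts.

    smooth=True  -> log(total / (associated + 1)), safe when associated is 0
    smooth=False -> log(total / associated), raises ZeroDivisionError if the
                    phrase appears in no sample text
    """
    total = len(segmented_samples)
    associated = sum(1 for sample in segmented_samples if phrase in sample)
    denominator = associated + 1 if smooth else associated
    return math.log(total / denominator)


corpus = [
    ["the elderly", "of", "nursing"],
    ["diet", "of", "recuperation"],
    ["of", "travel"],
]
# "of" appears in every sample text, so its unsmoothed IDF is log(1) = 0.
print(inverse_document_frequency("of", corpus, smooth=False))  # 0.0
```

Note that with the "+1" variant a phrase contained in every sample text receives a slightly negative IDF, since the denominator then exceeds the total number of sample texts.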

Further optionally, the smaller the inverse document frequency, the more often the word segmentation phrase appears in the corpus and the less likely it is to carry actual meaning; therefore, word segmentation phrases that appear at a very high frequency in every sample text data, such as "of", can be culled based on the inverse document frequency. Specifically, the user equipment may determine a word segmentation phrase whose inverse document frequency is less than a phrase effective threshold as an invalid word segmentation phrase, and mark the word segmentation phrases among the f₁ word segmentation phrases other than the invalid word segmentation phrases as valid word segmentation phrases. Further, when determining the text keyword corresponding to the text data from the f₁ word segmentation phrases based on the phrase importance, the user equipment may determine the text keyword from the valid word segmentation phrases based on the phrase importance.

Optionally, the user equipment may determine the phrase importance of the segmented phrase based on the phrase frequency and the inverse document frequency of the segmented phrase, where the user equipment may use a product of the phrase frequency and the inverse document frequency of the segmented phrase as the phrase importance of the segmented phrase, or may use a weighted product of the phrase frequency and the inverse document frequency of the segmented phrase as the phrase importance of the segmented phrase, and the like, which is not limited herein. Optionally, if a candidate picture is recommended for the target text data, a text keyword of the target text data may be extracted, and step S702 is executed for the target text data.
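The keyword extraction described above can be sketched as follows. This is a minimal illustration, not the implementation of the application: the function names, the sample corpus, and the threshold value are all hypothetical, the corpus is assumed to be already segmented, and the phrase importance is taken as the plain product of TF and IDF (the unweighted option mentioned above).

```python
import math
from collections import Counter

def inverse_document_frequency(phrase, sample_texts):
    """IDF = log(total sample texts / (texts containing the phrase + 1))."""
    associated = sum(1 for text in sample_texts if phrase in text)
    return math.log(len(sample_texts) / (associated + 1))

def phrase_importance(phrases, sample_texts, effective_threshold):
    """Return {phrase: TF * IDF}, skipping phrases whose IDF is below the threshold."""
    counts = Counter(phrases)
    total = len(phrases)
    importance = {}
    for phrase, count in counts.items():
        idf = inverse_document_frequency(phrase, sample_texts)
        if idf < effective_threshold:
            continue  # invalid word segmentation phrase, e.g. "of"
        importance[phrase] = (count / total) * idf
    return importance

# Hypothetical corpus: each sample text is the set of its segmented phrases.
corpus = [
    {"elderly", "of", "care"},
    {"history", "of", "art"},
    {"price", "of", "oil"},
    {"garden", "care"},
    {"city", "traffic"},
    {"sea", "travel"},
]
scores = phrase_importance(["elderly", "of", "care"], corpus, effective_threshold=0.5)
text_keyword = max(scores, key=scores.get)
```

With this made-up corpus, "of" appears in half of the sample texts, so its IDF falls below the assumed threshold and it is culled before the text keyword is chosen.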

Step S703 is to acquire M candidate pictures matched with the text keywords from the candidate recommendation gallery, and display the M candidate pictures in the data recommendation area.

In this embodiment of the application, the user equipment may obtain M candidate pictures matched with the text keywords from the candidate recommendation gallery, and display the M candidate pictures in the data recommendation area. Optionally, when a candidate picture is added to the candidate recommendation gallery, a picture tag may be added to the candidate picture, and the user equipment may obtain the M candidate pictures by matching the picture tags against the text keywords. Further, the user equipment can perform vector conversion on the text keywords to obtain a keyword vector; acquire, from the candidate recommendation gallery, at least two associated pictures whose picture tags match the text keywords, and perform picture coding on each of the at least two associated pictures to obtain the picture vectors respectively corresponding to the at least two associated pictures; determine the semantic similarity between each associated picture and the text keywords according to the vector distance between its picture vector and the keyword vector; and acquire the M candidate pictures from the at least two associated pictures based on the semantic similarity.

For example, please refer to fig. 9, which is a schematic diagram of a distance obtaining scenario according to an embodiment of the present application. As shown in fig. 9, the user equipment may obtain an associated picture 901, and perform picture coding on the associated picture 901 to obtain a picture vector of the associated picture 901; and perform vector conversion on the text keywords to obtain a keyword vector of the text keywords. Distance measurement is performed on the picture vector of the associated picture 901 and the keyword vector to obtain the vector distance between them. The vector distance may be a Euclidean distance or a cosine distance, where the Euclidean distance is the straight-line distance between two points, and the cosine distance is a distance obtained from the cosine of the included angle between two vectors.
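The two distance measures named above can be sketched as follows. The function names and the example vectors are hypothetical, and the cosine distance is assumed here to be one minus the cosine of the included angle, which is a common convention that the text does not spell out.

```python
import math

def euclidean_distance(a, b):
    """Straight-line distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_distance(a, b):
    """1 - cos(angle between a and b); assumes neither vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

# Hypothetical two-dimensional vectors; real vectors have the target vector length.
keyword_vector = [3.0, 4.0]
picture_vector = [4.0, 3.0]
d_euclid = euclidean_distance(keyword_vector, picture_vector)
d_cosine = cosine_distance(keyword_vector, picture_vector)
```

In either case a smaller distance corresponds to a higher semantic similarity between the associated picture and the text keyword.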

Further, the user equipment can map the text keywords to a target semantic space to obtain a first vector of the text keywords; and performing dimensionality reduction on the first vector to obtain a keyword vector with the target vector length. Mapping the at least two associated pictures to a target semantic space respectively to obtain second vectors corresponding to the at least two associated pictures respectively; performing dimensionality reduction processing on the at least two second vectors to obtain picture vectors corresponding to the at least two associated pictures respectively; the length of the picture vector is the target vector length. That is, both the picture (i.e. the associated picture) and the text (i.e. the text keyword) are mapped to the same semantic space, and the semantic similarity between the associated picture and the text keyword can be represented by the vector distance between the picture vector and the keyword vector.

For example, please refer to fig. 10, which is a schematic diagram of a text processing scenario provided in the embodiment of the present application. As shown in fig. 10, the user equipment acquires the text data "nursing for the elderly", performs word segmentation processing on the text data to obtain f₁ word segmentation phrases "old person, nursing", and determines the text keyword "elder care" from the f₁ word segmentation phrases. The text keyword is mapped to the target semantic space and vector conversion is performed to obtain a first vector of the text keyword; assume that the first vector is "0 0 0 0.12 0 0 0 0 0.56 0.23 0 0 …" and that its vector length is 30000. Dimension reduction processing is then performed on the first vector to obtain a keyword vector with the target vector length; assuming that the target vector length is 1000, the resulting keyword vector is "0.12 -0.33 0.15 -0.54 -0.01 0.04 0.31 …". The user equipment may convert the text keyword into a sparse first vector according to the phrase frequency and the inverse document frequency of the text keyword. For example, the text keyword may be subjected to vector conversion by using a text vector conversion model to obtain the first vector, where the text vector conversion model may be, but is not limited to, a bag-of-words model, a vector space model, and the like. Optionally, the user equipment may perform dimension reduction processing on the first vector by using a dimension reduction model, which may include, but is not limited to, Principal Component Analysis (PCA), Singular Value Decomposition (SVD), Linear Discriminant Analysis (LDA), or the like, and is not limited herein.
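As a rough sketch of the two stages above, the following builds a sparse bag-of-words first vector and then projects it to a shorter keyword vector. The vocabulary, the weights, and the projection matrix are all hypothetical; in particular, the projection matrix is only a stand-in for one that a dimension reduction model such as PCA or SVD would learn from data.

```python
def bag_of_words_vector(keyword_weights, vocabulary):
    """Sparse first vector: one slot per vocabulary entry, weight or 0."""
    return [keyword_weights.get(term, 0.0) for term in vocabulary]

def reduce_dimension(vector, projection):
    """Linear projection; each row of `projection` defines one output axis."""
    return [sum(w * x for w, x in zip(row, vector)) for row in projection]

# Hypothetical six-term vocabulary (the text assumes a length of 30000).
vocabulary = ["art", "care", "city", "elderly", "oil", "sea"]
# Hypothetical TF-IDF weights of the keyword's phrases.
weights = {"elderly": 0.56, "care": 0.12}
first_vector = bag_of_words_vector(weights, vocabulary)

# Stand-in for a learned projection to a target vector length of 2.
projection = [
    [0.0, 1.0, 0.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, 0.0, -1.0, 0.0, 0.0],
]
keyword_vector = reduce_dimension(first_vector, projection)
```

The first vector is mostly zeros because only the phrases of the keyword carry a weight, which is why a dimension reduction step is applied before measuring distances.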

For example, please refer to fig. 11, which is a schematic diagram of a picture processing scenario according to an embodiment of the present application. As shown in fig. 11, the user equipment acquires an associated picture 1101 and maps the associated picture to the target semantic space to obtain a second vector 1104 corresponding to the associated picture 1101, where the vector conversion process may be implemented based on a picture vector conversion model 1102. Further, the user equipment may perform dimension reduction processing on the second vector 1104 by using a fully connected layer to obtain a picture vector 1105 with the target vector length. Optionally, the picture vector conversion model 1102 includes a convolutional neural network 1103, which can extract semantic information of the associated picture. For example, the convolutional neural network 1103 may use 3 × 3 convolution kernels and 16 network layers, of which 13 are convolutional layers, but is not limited thereto; that is, the size of the convolution kernel used by the convolutional neural network 1103, the number of network layers, the number of convolutional layers, and so on may be changed according to actual needs. Optionally, the convolutional neural network 1103 may further include a max pooling layer, which can reduce the scale of the associated picture. As shown in fig. 11, the max pooling layer performs max pooling on a first matrix 1106 to obtain a second matrix 1107, using a 2 × 2 window with a stride of 2 (the window size and the stride may be changed as needed). That is, max pooling the four 2 × 2 sub-matrices of the first matrix 1106 in turn yields the values 9, 5, 6, and 8 in the second matrix 1107. Based on this principle, the scale of the associated picture is converted by the max pooling layer.
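The 2 × 2, stride-2 max pooling described above can be sketched as follows. The entries of the first matrix are hypothetical: the original matrix of fig. 11 is not reproduced in the text, so the values below were chosen only so that the pooled results are the 9, 5, 6, and 8 mentioned above, and their arrangement in the second matrix is likewise an assumption.

```python
def max_pool(matrix, size=2, stride=2):
    """Slide a size x size window over the matrix and keep each window's maximum."""
    rows = range(0, len(matrix) - size + 1, stride)
    cols = range(0, len(matrix[0]) - size + 1, stride)
    return [
        [max(matrix[r + i][c + j] for i in range(size) for j in range(size))
         for c in cols]
        for r in rows
    ]

# Hypothetical 4 x 4 stand-in for the first matrix 1106.
first_matrix = [
    [1, 9, 2, 5],
    [3, 4, 0, 1],
    [6, 2, 8, 3],
    [1, 0, 7, 4],
]
second_matrix = max_pool(first_matrix)  # [[9, 5], [6, 8]]
```

Each pooling step halves both the height and the width, which is how the max pooling layer reduces the scale of the associated picture.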

For example, the convolutional neural network may be built by combining groups of two or three convolutional layers with a max pooling layer. Taking a 224 × 224 associated picture as input, the structure of one such convolutional neural network is shown in table 1, which lists the structure of the convolutional neural network and the change in scale of the picture, as follows:

TABLE 1

The number associated with the fully connected layer represents the dimensionality of the vector obtained after the associated picture is processed, so that the length of the resulting picture vector equals the target vector length.

Further, the user equipment may obtain M candidate pictures from the at least two associated pictures based on the semantic similarity. Specifically, the user equipment may directly display the M candidate pictures in the data recommendation area; alternatively, the M candidate pictures may be displayed in the data recommendation area in order of their semantic similarities, that is, from the largest semantic similarity to the smallest, and the like, which is not limited herein. Optionally, since the data recommendation area may be too small to display all M candidate pictures at once, a first sliding control may be displayed in the data recommendation area, and the user equipment may respond to a sliding operation on the first sliding control to display the M candidate pictures to the user.
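Selecting and ordering the candidate pictures by semantic similarity can be sketched as follows. The function name, the picture identifiers, and the similarity values are hypothetical.

```python
def top_m_pictures(similarities, m):
    """Return the ids of the m most similar pictures, most similar first."""
    ranked = sorted(similarities.items(), key=lambda item: item[1], reverse=True)
    return [picture_id for picture_id, _ in ranked[:m]]

# Hypothetical semantic similarities of four associated pictures.
similarities = {"pic_a": 0.42, "pic_b": 0.91, "pic_c": 0.67, "pic_d": 0.15}
candidates = top_m_pictures(similarities, m=3)  # ["pic_b", "pic_c", "pic_a"]
```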

Specifically, referring to fig. 12, fig. 12 is a schematic diagram of a layout recommendation scenario provided in an embodiment of the present application. As shown in fig. 12, the user equipment 1201 responds to a picture recommendation request for a media file and obtains text data in the media file, such as the text data 1203 (care of the elderly) displayed in a first display page 1202 of the media file; it extracts the text keywords "elderly, care" from the text data 1203, obtains M candidate pictures matching the text keywords, and displays the M candidate pictures 1205 in the data recommendation area 1204. Optionally, a first sliding control may be displayed in the data recommendation area 1204, and when responding to a trigger operation on the first sliding control, the user equipment 1201 may update which candidate pictures are in the page display state in the data recommendation area 1204, where the page display state means that the candidate pictures can be seen by the user.

Step S704, responding to picture selection operation aiming at a target picture in the M candidate pictures, acquiring N layout templates, generating N layout pages based on the N layout templates, and displaying the N layout pages in the data recommendation area.

In the embodiment of the application, the user equipment can respond to the picture selection operation aiming at the target picture in the M candidate pictures to obtain N layout templates; acquiring picture display areas respectively included by the N layout templates, adding a target picture to the picture display areas in the N layout templates, and generating N layout pages; and displaying N layout pages in the data recommendation area. Wherein, the number of the target pictures can be one or at least two.

Optionally, as shown in fig. 12, the user equipment 1201 may obtain N layout templates in response to a picture selection operation for the target picture 1206, write the target picture and text data in each layout template, generate N layout pages, and may display the N layout pages 1207 in the data recommendation area 1204. Optionally, the user device 1201 may also display the target picture 1206 in the first display page 1202, which is an optional process. Optionally, the user device 1201 may further display a second sliding control in the data recommendation area 1204, and when a trigger operation for the second sliding control is responded, the layout page in the page display state in the data recommendation area 1204 may be updated.

Specifically, the user equipment can, in response to the picture selection operation for the target picture in the M candidate pictures, obtain a layout selection quantity and a layout recommendation ratio, and determine a first layout recommendation quantity and a second layout recommendation quantity according to the layout selection quantity and the layout recommendation ratio; the sum of the first layout recommendation quantity and the second layout recommendation quantity is the layout selection quantity. For example, if the layout selection quantity is N and the layout recommendation ratio of the cold-start layout pool to the hot-start layout pool is 1:9, the first layout recommendation quantity is N × 10% and the second layout recommendation quantity is N × 90%. The hot-start layout templates included in the hot-start layout pool are high-quality layout templates with large recommended weight values, while the cold-start layout templates included in the cold-start layout pool have small recommended weight values. Further, the user equipment can acquire first layout templates from the cold-start layout pool based on the first layout recommendation quantity, and acquire second layout templates from the hot-start layout pool based on the second layout recommendation quantity, where the recommended weight value of a second layout template is greater than that of a first layout template; combine the first layout templates and the second layout templates with the target picture respectively to generate N layout pages, N being the layout selection quantity; and display the N layout pages in the data recommendation area.
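The split between the two pools can be sketched as follows. The function names and pool contents are hypothetical. Two points are assumptions that the text does not specify: fractional quantities are rounded, and templates are drawn uniformly at random within each pool (each pool must hold at least as many templates as are requested from it).

```python
import random

def split_quantities(selection_quantity, cold_share=1, hot_share=9):
    """Split the selection quantity by the cold:hot recommendation ratio."""
    cold = round(selection_quantity * cold_share / (cold_share + hot_share))
    return cold, selection_quantity - cold

def pick_templates(cold_pool, hot_pool, selection_quantity,
                   cold_share=1, hot_share=9, rng=random):
    """Draw first templates from the cold pool and second templates from the hot pool."""
    cold_n, hot_n = split_quantities(selection_quantity, cold_share, hot_share)
    return rng.sample(cold_pool, cold_n) + rng.sample(hot_pool, hot_n)

# Hypothetical pools of template names.
cold_pool = [f"cold_{i}" for i in range(5)]
hot_pool = [f"hot_{i}" for i in range(20)]
templates = pick_templates(cold_pool, hot_pool, 10, rng=random.Random(0))
```

Reserving a small share for the cold-start pool gives newly added templates a chance to be shown and to accumulate selection statistics.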

Optionally, the user equipment may update the target historical recommendation times and the target historical selection times of the target layout page based on the page selection operation for the target layout page, that is, if the target layout page is selected, it indicates that the recommendation times and the selection times of the target layout template corresponding to the target layout page are both increased, and may update the target historical recommendation times and the target historical selection times of the target layout template corresponding to the target layout page. Updating a recommended weight value of a target layout template corresponding to the target layout page according to the target history recommendation times and the target history selection times, wherein optionally, the smaller the ratio of the target history selection times to the target history recommendation times, the smaller the recommended weight value of the target layout template is, that is, the ratio of the target history selection times to the target history recommendation times can be directly obtained, the ratio is used as the target history utilization rate of the target layout template, and the target history utilization rate is determined as the recommended weight value of the target layout template; alternatively, the target history selection times and the target history recommendation times may be weighted, a ratio of the weighted target history selection times to the weighted target history recommendation times is obtained, and the ratio is determined as a recommendation weight value of the target layout template, and the like, which is not limited herein.
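The unweighted variant of this update can be sketched as follows. The class and function names are hypothetical, and the sketch assumes that every displayed template counts as recommended once while only the chosen template counts as selected.

```python
from dataclasses import dataclass

@dataclass
class LayoutTemplate:
    name: str
    recommended: int = 0  # historical recommendation times
    selected: int = 0     # historical selection times

    @property
    def weight(self):
        """Recommended weight value = selection times / recommendation times."""
        return self.selected / self.recommended if self.recommended else 0.0

def record_feedback(shown, chosen):
    """Update the counters after the user picks `chosen` among the `shown` templates."""
    for template in shown:
        template.recommended += 1
    chosen.selected += 1

# Hypothetical history for two templates.
t1 = LayoutTemplate("title_left", recommended=9, selected=4)
t2 = LayoutTemplate("title_top", recommended=4, selected=1)
record_feedback([t1, t2], chosen=t1)
```

After the update, the weight of the selected template rises from 4/9 to 5/10, while the weight of the template that was shown but not selected falls from 1/4 to 1/5.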

Optionally, the user equipment may obtain at least two cold-starting format templates included in the cold-starting format pool and a first recommended weight value corresponding to each cold-starting format template; and adding the cold-starting layout template with the first recommended weight value larger than or equal to the weight threshold value into the hot-starting layout pool. That is, if the first recommended weight value of the cold-start layout template in the cold-start layout pool is greater than or equal to the weight threshold, indicating that the cold-start layout template belongs to the high-quality template, the cold-start layout template whose first recommended weight value is greater than or equal to the weight threshold may be added to the hot-start layout pool, and the cold-start layout template whose first recommended weight value is greater than or equal to the weight threshold may be removed from the cold-start layout pool. Further, at least two hot start layout templates included in the hot start layout pool and a second recommended weight value corresponding to each hot start layout template can be obtained; and removing the hot-start layout templates with the second recommended weight value smaller than the weight threshold from the hot-start layout pool. Optionally, the hot-start layout template with the second recommended weight value smaller than the weight threshold may be directly deleted; the hot-start layout template with the second recommended weight value smaller than the weight threshold value can also be added into the cold-start layout pool, and the hot-start layout template with the second recommended weight value smaller than the weight threshold value in the hot-start layout pool is deleted.
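The movement of templates between the two pools can be sketched as follows. The pools are modeled as hypothetical dictionaries from template name to recommended weight value, and the `keep_demoted` flag is an illustrative name for the choice between moving a removed hot-start template back to the cold-start pool and deleting it outright.

```python
def rebalance_pools(cold_pool, hot_pool, weight_threshold, keep_demoted=True):
    """Return new (cold, hot) pools after promotion and demotion by weight."""
    promoted = {n: w for n, w in cold_pool.items() if w >= weight_threshold}
    demoted = {n: w for n, w in hot_pool.items() if w < weight_threshold}
    new_cold = {n: w for n, w in cold_pool.items() if n not in promoted}
    new_hot = {n: w for n, w in hot_pool.items() if n not in demoted}
    new_hot.update(promoted)
    if keep_demoted:
        new_cold.update(demoted)  # otherwise demoted templates are simply dropped
    return new_cold, new_hot

# Hypothetical pools and threshold.
cold = {"c1": 0.7, "c2": 0.1}
hot = {"h1": 0.9, "h2": 0.2}
new_cold, new_hot = rebalance_pools(cold, hot, weight_threshold=0.5)
```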

If the time for which the layout page corresponding to the first layout template stays in the page display state is greater than or equal to a recommended exposure duration threshold, the first historical recommendation times of the first layout template are updated. The layout page corresponding to a layout template has a page display state and a page hiding state. Among the N layout pages displayed in the data recommendation area, a layout page in the page display state can be seen by the user, while a layout page in the page hiding state cannot be seen by the user and cannot be triggered by a page selection operation. Counting the time for which the layout page corresponding to a layout template stays in the page display state improves the accuracy of the recommended weight value determined for the layout template. Further, the layout utilization rate of the first layout template is obtained according to the updated first historical recommendation times and the first historical selection times of the first layout template. Optionally, if the layout page corresponding to the first layout template is triggered by a page selection operation, the first historical selection times of the first layout template may also be updated, and the layout utilization rate of the first layout template is obtained according to the updated first historical recommendation times and the updated first historical selection times. If the layout utilization rate is smaller than a layout retention threshold, the first layout template is deleted from the cold-start layout pool.
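The exposure-based counting and the retention check can be sketched as follows. The function name, the dictionary keys, and the threshold values are hypothetical, and the layout utilization rate is assumed to be the ratio of selection times to recommendation times.

```python
def update_after_exposure(stats, visible_seconds, was_selected,
                          exposure_threshold, retention_threshold):
    """Update a cold-start template's counters; return True if it should be kept."""
    if visible_seconds >= exposure_threshold:
        stats["recommended"] += 1  # counted only if the page was really visible
    if was_selected:
        stats["selected"] += 1
    if stats["recommended"] == 0:
        return True  # no exposure yet, nothing to judge
    utilization = stats["selected"] / stats["recommended"]
    return utilization >= retention_threshold

# Hypothetical history: shown long enough, not selected.
stats_a = {"recommended": 9, "selected": 1}
keep_a = update_after_exposure(stats_a, visible_seconds=3.0, was_selected=False,
                               exposure_threshold=2.0, retention_threshold=0.15)

# Hypothetical history: scrolled past too quickly to count as a recommendation.
stats_b = {"recommended": 4, "selected": 1}
keep_b = update_after_exposure(stats_b, visible_seconds=0.5, was_selected=False,
                               exposure_threshold=2.0, retention_threshold=0.2)
```

In the first case the exposure is counted, the utilization rate drops to 1/10, and the template would be deleted; in the second the brief exposure is ignored and the template is kept.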

Optionally, the media file includes at least two display pages, and the at least two display pages include the first display page. An illustration mark is applied to a second display page, among the at least two display pages, that has the same semantic information as the first display page; the illustration mark indicates that the M candidate pictures are recommended for the second display page. For a second display page with the same semantic information, the user can select pictures directly, without semantic analysis and picture recommendation being performed again for that page, which improves the flexibility and efficiency of picture recommendation.

Optionally, the user equipment may obtain the display position of the first display page in the media file, where the display position includes a cover position, a chapter position, a text position, an end position, and the like, and obtain N layout templates based on the display position of the first display page. For example, at the cover position a picture generally serves as the background, while at the text position a picture generally serves as body content for explanation and the like. The layout templates can therefore be recommended more precisely according to the display position of the first display page, which improves the accuracy of layout recommendation.

Optionally, in response to the picture selection operation for the target picture in the M candidate pictures, the user equipment may obtain the number of display texts, that is, of the text data displayed in the first display page of the media file, and the number of display pictures, that is, of the file pictures displayed in the first display page. The user equipment then obtains a target layout pool corresponding to the number of display texts and the number of display pictures, and obtains N layout templates from the target layout pool. Specifically, the target layout pool corresponds to the number of display texts, the number of display pictures, and the number of target pictures; that is, there may be a plurality of layout pools, each corresponding to a different combination of text number and picture number. For example, if the number of display texts is 1, the number of display pictures is 1, and the number of target pictures is 2, the user equipment may obtain the target layout pool whose text number is 1 and whose picture number is 3. The target picture, the text data, and the file picture are written into each of the N layout templates to generate N layout pages, and the N layout pages are displayed in the data recommendation area.

Optionally, the format pool may also be divided by the number of texts and the number of pictures, cold start and hot start, and the like, which is not limited herein.

Step S705, responding to page selection operation aiming at a target layout page in the N layout pages, and switching and displaying the first display page as the target layout page.

In the embodiment of the present application, as shown in fig. 12, in response to a page selection operation for a target layout page 1208 in N layout pages 1207, the first display page 1202 is switched and displayed as the target layout page 1208, so that the layout of each element included in the first display page 1202 is realized. Optionally, the user equipment may cancel displaying the data recommendation area 1204 when displaying the target layout page 1208, or may still display the data recommendation area 1204.

Optionally, when the user generates a third display page, the user equipment may respond to a layout extraction request for the third display page, convert the third display page into a third layout template, and add the third layout template to the cold-start layout pool. Optionally, the user equipment may respond to the layout extraction request for the third display page by displaying a sharing attribute selection message; when responding to a confirmation operation for the sharing attribute selection message, it converts the third display page into a third layout template and adds the third layout template to a layout pool (e.g., the cold-start layout pool); when responding to a cancel operation for the sharing attribute selection message, it may convert the third display page into a third layout template and store the third layout template in the user equipment (i.e., locally).

Further, referring to fig. 13, fig. 13 is a flowchart of a format pool updating method provided by an embodiment of the present application. As shown in fig. 13, the method includes the steps of:

step S130a, extracting a sample layout template from sample media files in an existing corpus.

In the embodiment of the application, the sample layout template can be extracted from the sample media files in the existing corpus.

Step S130b, add the sample layout template to the cold-start layout pool.

In step S130d, N layout templates are obtained.

In the embodiment of the present application, when a picture selection operation for a target picture is responded, N layout templates are obtained from the cold-start layout pool and the hot-start layout pool 130c, which is specifically described in step S704 in fig. 7.

Step S130e, performing positive feedback on the selected layout template.

In the embodiment of the application, positive feedback is performed on the selected layout template; that is, among the N layout templates, the target layout template corresponding to the target layout page triggered by the page selection operation receives positive feedback, which means that the recommended weight value of the target layout template is increased.

For a cold-start layout template in the cold-start layout pool, please refer to the following steps:

Step S130g, determining whether the template is selected after being recommended.

In this embodiment of the application, whether the cold-start layout template in the cold-start layout pool is selected after being recommended is detected, that is, whether the first layout template belonging to the cold-start layout pool in the recommended N layout templates is selected, if yes, step S130h is executed, and if not, step S130i is executed.

And step S130h, updating the recommended weight value of the cold-starting layout template, and determining the storage position of the cold-starting layout template based on the recommended weight value.

In this embodiment of the application, if the layout page corresponding to a first layout template is triggered by a page selection operation, the first historical selection times of the first layout template may be updated, and the recommended weight value of the first layout template is increased. If the recommended weight value of the first layout template is greater than or equal to the weight threshold, the first layout template is added to the hot-start layout pool; if the recommended weight value of the first layout template is less than the weight threshold, the first layout template remains in the cold-start layout pool.

And step S130i, processing the cold-start layout template based on the layout utilization rate.

In the embodiment of the application, the format utilization rate of the first format template can be obtained, and if the format utilization rate of the first format template is smaller than a format retention threshold, the first format template in the cold-start format pool is deleted; and if the format utilization rate of the first format template is greater than or equal to the format retention threshold, not processing the first format template.

For a hot-start layout template in the hot-start layout pool, please refer to the following steps:

Step S130k, after recommendation, updating the recommended weight value based on user feedback.

In this embodiment of the application, after a hot-start layout template in the hot-start layout pool is recommended (that is, a second layout template among the N layout templates), its recommended weight value may be updated based on user feedback. That is, it is detected whether the layout page corresponding to the second layout template is triggered by a page selection operation, the second historical selection times of the second layout template are updated based on the detection result, and the recommended weight value of the second layout template is determined based on the second historical recommendation times and the second historical selection times of the second layout template.

Step S130l, determining whether the recommended weight value is lower than the weight threshold.

In this embodiment of the present application, it is detected whether the recommended weight of the hot-start layout template in the hot-start layout pool is lower than a weight threshold, and if the recommended weight is lower than the weight threshold, step S130m is executed.

In step S130m, the hot-start layout templates below the weight threshold are removed from the hot-start layout pool.

In this embodiment of the application, the hot-start layout template whose recommended weight value is smaller than the weight threshold may be removed from the hot-start layout pool, and optionally, the removed hot-start layout template may also be added to the cold-start layout pool.

Further, referring to fig. 14, fig. 14 is a flowchart of a method for updating a recommended weight value according to an embodiment of the present application. As shown in fig. 14, the method includes the steps of:

step S140a, a sample layout template is extracted from sample text data of the corpus.

Step S140b, a layout pool is generated.

In an embodiment of the present application, a layout pool may be generated based on the sample layout templates, and the layout pool may include a hot-start layout pool and a cold-start layout pool, where the cold-start layout pool is used to store newly extracted layout templates. Optionally, the hot-start layout pool and the cold-start layout pool may each be divided into a plurality of layout pools based on the number of texts and the number of pictures; that is, the hot-start layout pool may include one or at least two hot-start layout pools, each corresponding to a different text number and picture number, and the cold-start layout pool may include one or at least two cold-start layout pools, each corresponding to a different text number and picture number. Optionally, there may also be one or at least two layout pools whose text numbers and picture numbers differ from each other, that is, the text number and the picture number of layout pool 1 differ from those of layout pool 2, and each layout pool may include a hot-start layout pool and a cold-start layout pool.

In step S140c, the text data is edited.

In the embodiment of the present application, reference may be made to the specific description shown in step S701 in fig. 7, which is not described herein again.

Step S140d, semantic analysis.

In the embodiment of the present application, reference may be made to the specific description shown in step S702 in fig. 7, and details are not repeated here.

In step S140e, M candidate pictures are recommended based on the semantic analysis result.

In the embodiment of the present application, reference may be made to the specific description shown in step S703 in fig. 7, which is not described herein again.

Step S140f, responding to the picture selection operation for the target picture.

Step S140g, starting layout recommendation.

Step S140h, acquiring N layout templates from the layout pool.

In the embodiment of the present application, reference may be made to the detailed description shown in step S704 in fig. 7 for step S140f, step S140g, and step S140h, which are not described herein again. The user equipment can generate N layout pages according to the N layout templates, and the N layout pages are displayed in the data recommendation area.

Step S140i, in response to the page selection operation for the target layout page, applies the target layout page.

In the embodiment of the present application, reference may be made to the specific description shown in step S705 in fig. 7, and details are not repeated here.

Step S140j, performing weighted feedback on the layout templates.

In the embodiment of the application, the N layout templates are subjected to weighted feedback based on page selection operation of the target layout page.

In step S140k, the recommended weight values of the N layout templates are updated.

In the embodiment of the application, the historical recommendation times and the historical selection times of the N layout templates are updated based on the page selection operation for the target layout page, and the recommended weight values of the N layout templates are updated according to the historical recommendation times and the historical selection times of each layout template.
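For example, the update in step S140k may be sketched as follows. The embodiments only state that the recommended weight value is updated according to the two counts; the formula used here (selection times divided by recommendation times) is an assumption, and the function name `update_weights` and the dictionary layout are hypothetical.

```python
def update_weights(stats, recommended_ids, selected_id):
    """Update counts and weights after one page selection operation.

    stats maps template id -> {"recommended": int, "selected": int, "weight": float}.
    """
    for template_id in recommended_ids:
        entry = stats.setdefault(
            template_id, {"recommended": 0, "selected": 0, "weight": 0.0}
        )
        entry["recommended"] += 1          # historical recommendation times
        if template_id == selected_id:
            entry["selected"] += 1         # historical selection times
        # Assumed formula: weight = selection times / recommendation times.
        entry["weight"] = entry["selected"] / entry["recommended"]
    return stats


stats = update_weights({}, ["a", "b", "c"], selected_id="b")
```

Under this assumed formula, a template that is recommended often but rarely selected drifts toward a weight of 0, while a frequently selected template approaches 1.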

Further, referring to fig. 15, fig. 15 is a schematic diagram of a layout dictionary tree provided in an embodiment of the present application. As shown in fig. 15, the layout dictionary tree is a dictionary tree constructed based on the number of texts and the number of pictures. For example, assuming that the first display page includes one file picture and two pieces of text data, the method may enter layout pool C, the "single-picture double-text layout pool" indicated in the layout dictionary tree shown in fig. 15, and acquire N layout templates from the "single-picture double-text layout pool". Specifically, when the target picture is inserted into the first display page, the layout pool is looked up according to the existing content of the first display page, which may include text data, a file picture, or no other content. If the first display page has no other content, N layout templates can be obtained from the single-picture layout pool; if the first display page has one piece of text data and no other content, N layout templates can be obtained from the single-picture single-text layout pool; if the first display page has two pieces of text data, N layout templates can be obtained from the single-picture double-text layout pool; if the first display page has one file picture and one piece of text data, N layout templates can be obtained from the double-picture single-text layout pool; if the first display page has one file picture and no other content, N layout templates can be obtained from the double-picture no-text layout pool; and so on.
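For example, the lookup in the layout dictionary tree may be sketched as follows. The sketch follows the enumerated cases above, in which the target picture is counted in addition to any file picture already on the page; the two-level nesting (picture count first, then text count), the name `LAYOUT_TREE`, and the function `find_layout_pool` are assumptions made for illustration.

```python
# Assumed shape: first level = number of pictures, second level = number of texts.
LAYOUT_TREE = {
    1: {
        0: "single-picture layout pool",
        1: "single-picture single-text layout pool",
        2: "single-picture double-text layout pool",
    },
    2: {
        0: "double-picture no-text layout pool",
        1: "double-picture single-text layout pool",
    },
}


def find_layout_pool(file_picture_count, text_count):
    """Return the pool name for a page once the target picture is inserted."""
    picture_count = file_picture_count + 1  # existing file pictures + target picture
    return LAYOUT_TREE.get(picture_count, {}).get(text_count)


pool = find_layout_pool(file_picture_count=0, text_count=2)
```

A combination that has no node in the tree returns `None` in this sketch; how such a case is handled is not specified in the embodiments.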

Referring to fig. 16, fig. 16 is a schematic diagram of a theme data structure provided in an embodiment of the present application. As shown in fig. 16, assuming that the media file is a presentation file, the theme tree corresponding to the media file may be a tree with a presentation node (Presentation) as the root node, and the theme tree includes one or more style nodes, such as a master (Master) node, a format node, a file page node, a remark master node, a remark file page node, a lecture master node, a theme (topics) node, a code (Code) node, a presentation file attribute node, a view attribute (View attributes) node, a font node, and the like. In the theme tree, the edges between the root node and the style nodes, and between one style node and another, represent the association relationship between the nodes connected by those edges.

Further, please refer to fig. 17, wherein fig. 17 is a schematic diagram of a file processing apparatus according to an embodiment of the present application. The file processing apparatus may be a computer program (including program code, etc.) running in a computer device; for example, the file processing apparatus may be application software. The apparatus may be used to perform the corresponding steps in the methods provided by the embodiments of the present application. As shown in fig. 17, the file processing apparatus 1700 may be used in the user equipment in the embodiment corresponding to fig. 3, and specifically, the apparatus may include: a candidate picture display module 11, a layout display module 12 and a layout selection module 13.

The candidate picture display module 11 is configured to display text data in a first display page of the media file, and for the text data, display M candidate pictures in the data recommendation area; M is a positive integer;

the layout display module 12 is configured to display N layout pages in the data recommendation area in response to a picture selection operation for a target picture in the M candidate pictures; each layout page comprises the target picture and the text data displayed in the first display page of the media file; N is a positive integer;

and the layout selection module 13 is configured to respond to a page selection operation for a target layout page in the N layout pages, and switch and display the first display page to the target layout page.

The candidate picture display module 11 includes:

a text input unit 111 for displaying target text data in a media file in response to a target text data input operation for the media file;

and a candidate picture display unit 112, configured to display M candidate pictures associated with the target text data in the data recommendation area in response to a picture recommendation operation for the target text data.

The candidate picture display module 11 includes:

the candidate picture display unit 112 is further configured to display target text data in a first display page of the media file, and display M candidate pictures associated with the target text data in the data recommendation area; or,

the candidate picture display unit 112 is further configured to display target text data in a first display page of the media file, and when a trigger operation for the picture recommendation component is received, display M candidate pictures associated with the target text data in the data recommendation area; or,

a component display unit 113 configured to display a picture recommendation component in response to a trigger operation for the target text data;

a component triggering unit 114, configured to display M candidate pictures associated with the target text data in the data recommendation area in response to a triggering operation for the picture recommendation component.

Wherein, the apparatus 1700 further comprises:

a utilization rate display module 14, configured to respond to a viewing operation for a target picture in the M candidate pictures, and display a historical utilization rate of the target picture; the historical utilization rate is used to represent the probability that the target picture is selected.

and the text box display module 15 is configured to display a content prompt text in a target text box to be edited of the target layout page, respond to an input operation for the target text box to be edited, and switch and display the content prompt text in the target text box to be edited into text content corresponding to the input operation.

The media file comprises at least two display pages, wherein the at least two display pages comprise a first display page; the apparatus 1700 further comprises:

the mark display module 16 is configured to apply an illustration mark to a second display page, among the at least two display pages, that has the same semantic information as the first display page; the illustration mark is used for indicating that the M candidate pictures are recommended for the second display page.

The candidate picture display module 11 includes:

a text obtaining unit 115, configured to, in response to a picture recommendation request for a media file, obtain text data displayed in a first display page in the media file based on the picture recommendation request;

a keyword extraction unit 116, configured to perform keyword extraction on the text data to obtain a text keyword corresponding to the text data;

and the picture acquisition unit 117 is configured to acquire M candidate pictures matching the text keyword from the candidate recommendation gallery, and display the M candidate pictures in the data recommendation area.

The keyword extraction unit 116 includes:

a text word segmentation subunit 1161, configured to perform word segmentation processing on the text data to obtain f₁ word-segmentation phrases; f₁ is a positive integer;

a word frequency obtaining subunit 1162, configured to obtain the phrase frequency corresponding to each of the f₁ word-segmentation phrases;

an inverse frequency obtaining subunit 1163, configured to obtain the inverse document frequency corresponding to each of the f₁ word-segmentation phrases;

an importance determining subunit 1164, configured to determine the phrase importance corresponding to each of the f₁ word-segmentation phrases according to the phrase frequency and the inverse document frequency corresponding to that word-segmentation phrase;

a keyword selection subunit 1165, configured to determine, based on the phrase importance, the text keywords corresponding to the text data from the f₁ word-segmentation phrases.
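For example, the importance determination and keyword selection may be sketched as follows. The embodiments state only that the importance is determined according to the phrase frequency and the inverse document frequency; taking their product, as in the common TF-IDF weighting, is an assumption, as are the function name `select_keywords` and the parameter `top_k`.

```python
def select_keywords(phrase_freq, inverse_doc_freq, top_k=3):
    """Rank phrases by importance = phrase frequency * inverse document frequency."""
    importance = {
        phrase: tf * inverse_doc_freq.get(phrase, 0.0)
        for phrase, tf in phrase_freq.items()
    }
    # Sort by descending importance; ties are broken alphabetically.
    ranked = sorted(importance, key=lambda p: (-importance[p], p))
    return ranked[:top_k]


keywords = select_keywords(
    {"spring": 0.5, "the": 0.25, "outing": 0.25},
    {"spring": 2.0, "the": 0.1, "outing": 3.0},
    top_k=2,
)
```

In this illustrative input, the frequent but uninformative phrase "the" receives a low importance because its inverse document frequency is small.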

The word frequency obtaining subunit 1162 includes:

a phrase dividing subunit 111a, configured to divide identical word-segmentation phrases among the f₁ word-segmentation phrases into f₂ phrase sets; the word-segmentation phrases included in each phrase set are the same; f₂ is a positive integer;

a word frequency determining subunit 111b, configured to count the phrase number of the word-segmentation phrases included in the i-th phrase set, and determine the ratio of the phrase number corresponding to the i-th phrase set to the sum of the phrase numbers of the word-segmentation phrases respectively included in the f₂ phrase sets as the phrase frequency corresponding to the word-segmentation phrase included in the i-th phrase set.
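For example, the phrase frequency computation may be sketched as follows, where each distinct phrase plays the role of one phrase set and the denominator is the sum of the phrase numbers over all phrase sets. The function name `phrase_frequencies` is hypothetical.

```python
from collections import Counter


def phrase_frequencies(phrases):
    """Return phrase -> (count in its phrase set) / (sum of counts over all sets)."""
    counts = Counter(phrases)      # one entry per phrase set
    total = sum(counts.values())   # sum over all phrase sets
    return {phrase: n / total for phrase, n in counts.items()}


freqs = phrase_frequencies(["spring", "outing", "spring", "park"])
```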

The inverse frequency obtaining subunit 1163 includes:

the sample word segmentation subunit 112a is configured to obtain at least two sample text data included in the corpus, and perform word segmentation processing on the at least two sample text data respectively to obtain sample word-segmentation phrases corresponding to the at least two sample text data respectively;

the correlation statistics subunit 112b is configured to determine the number of sample text data associated with a sample word-segmentation phrase as the number of associated texts of that sample word-segmentation phrase;

the inverse frequency determining subunit 112c is configured to obtain the total number of sample texts of the at least two sample text data, and determine the inverse document frequency of the sample word-segmentation phrase according to the total number of sample texts and the number of associated texts of the sample word-segmentation phrase.
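For example, the inverse document frequency may be sketched as follows. The embodiments do not give the exact formula; the smoothed logarithmic form log(total / (1 + associated texts)) used here is a common choice and is an assumption, as is the function name `inverse_document_frequencies`.

```python
import math


def inverse_document_frequencies(segmented_samples):
    """segmented_samples: one list of phrases per sample text in the corpus."""
    total = len(segmented_samples)  # total number of sample texts
    associated = {}                 # phrase -> number of associated texts
    for phrases in segmented_samples:
        for phrase in set(phrases):  # count each sample text at most once
            associated[phrase] = associated.get(phrase, 0) + 1
    # Assumed smoothed form: idf = log(total / (1 + number of associated texts)).
    return {p: math.log(total / (1 + n)) for p, n in associated.items()}


idf = inverse_document_frequencies(
    [["spring", "outing"], ["spring", "park"], ["meeting", "notes"]]
)
```

A phrase that appears in many sample texts receives a lower value than a phrase that appears in few, which is what allows low-value phrases to be screened out against a validity threshold.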

Wherein, the apparatus 1700 further comprises:

an invalid phrase division module 17, configured to determine the word-segmentation phrases whose inverse document frequency is less than the phrase validity threshold as invalid word-segmentation phrases, and mark the word-segmentation phrases among the f₁ word-segmentation phrases other than the invalid word-segmentation phrases as valid word-segmentation phrases;

the keyword selection subunit 1165 is specifically configured to:

The text word segmentation subunit 1161 includes:

the word graph generating subunit 113a is configured to split the text data to obtain at least two characters that form the text data, and form a directed word graph from the at least two characters; at least two characters are nodes of the directed word graph;

the path obtaining subunit 113b is configured to obtain at least two character paths according to the directed word graph and the association degree between adjacent characters in the directed word graph;

the path screening subunit 113c is configured to obtain path lengths corresponding to the at least two character paths, and determine a shortest character path from the at least two character paths according to the path lengths;

a phrase generating subunit 113d, configured to combine the characters corresponding to the shortest character path into word-segmentation phrases, to obtain the f₁ word-segmentation phrases.
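For example, segmentation along a shortest path in a directed word graph may be sketched as follows. The embodiments derive the candidate paths from the association degree between adjacent characters; this sketch swaps in a plain vocabulary lookup instead, so it illustrates only the shortest-path selection. The function name `segment` and the `vocabulary` parameter are hypothetical.

```python
def segment(text, vocabulary):
    """Segment text along the path with the fewest edges from 0 to len(text)."""
    n = len(text)
    best = [None] * (n + 1)  # best[i] = (edge count, previous position)
    best[0] = (0, None)
    for end in range(1, n + 1):
        for start in range(end):
            word = text[start:end]
            # An edge exists for vocabulary words and for single characters.
            if word in vocabulary or end - start == 1:
                candidate = (best[start][0] + 1, start)
                if best[end] is None or candidate[0] < best[end][0]:
                    best[end] = candidate
    # Walk back along the shortest path to recover the phrases.
    phrases, pos = [], n
    while pos > 0:
        prev = best[pos][1]
        phrases.append(text[prev:pos])
        pos = prev
    return phrases[::-1]


phrases = segment("icecreamshop", {"ice", "cream", "icecream", "shop"})
```

Because single characters always form an edge, every position is reachable, and the path with the fewest edges prefers the longest available phrases.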

The picture acquiring unit 117 includes:

a first vector conversion subunit 1171, configured to perform vector conversion on the text keyword to obtain a keyword vector;

a tag matching subunit 1172, configured to obtain, from the candidate recommendation gallery, at least two associated pictures whose picture tags are associated with the text keyword;

a second vector conversion subunit 1173, configured to perform picture coding on the at least two associated pictures respectively, so as to obtain picture vectors corresponding to the at least two associated pictures respectively;

a similarity obtaining subunit 1174, configured to determine the semantic similarity between the text keyword and each of the at least two associated pictures according to the vector distance between the keyword vector and the picture vector corresponding to that associated picture;

a picture selection sub-unit 1175 for obtaining M candidate pictures from the at least two associated pictures based on the semantic similarity.
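For example, the similarity ranking may be sketched as follows. The embodiments refer only to a vector distance between the keyword vector and the picture vectors; cosine similarity is used here as one possible measure and is an assumption, and the function names `cosine_similarity` and `top_m_pictures` are hypothetical. The vectors are assumed to have already been produced by the two vector conversion subunits.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_m_pictures(keyword_vector, picture_vectors, m):
    """picture_vectors maps picture id -> vector; return the m most similar ids."""
    ranked = sorted(
        picture_vectors,
        key=lambda pid: cosine_similarity(keyword_vector, picture_vectors[pid]),
        reverse=True,
    )
    return ranked[:m]


candidates = top_m_pictures(
    [1.0, 0.0],
    {"beach": [0.9, 0.1], "office": [0.0, 1.0], "sea": [1.0, 0.0]},
    m=2,
)
```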

The first vector conversion subunit 1171 is specifically configured to:

the second vector conversion subunit 1173 is specifically configured to:

Wherein, the layout display module 12 includes:

a first quantity obtaining unit 12a, configured to obtain, in response to a picture selection operation for a target picture in the M candidate pictures, a layout selection quantity and a layout recommendation ratio, and determine a first layout recommendation quantity and a second layout recommendation quantity according to the layout selection quantity and the layout recommendation ratio; the sum of the first layout recommendation quantity and the second layout recommendation quantity is the layout selection quantity;

the template obtaining unit 12b is configured to obtain a first layout template from the cold-start layout pool based on the first layout recommendation quantity, and obtain a second layout template from the hot-start layout pool based on the second layout recommendation quantity; the recommended weight value of the second layout template is greater than that of the first layout template;

the layout generating unit 12c is configured to combine the first layout template and the second layout template with the target picture, respectively, to generate N layout pages; N is the layout selection quantity;

and a layout display unit 12d for displaying the N layout pages in the data recommendation area.
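For example, the split between the cold-start layout pool and the hot-start layout pool may be sketched as follows. The sketch assumes that the layout recommendation ratio is the share of the layout selection quantity taken from the cold-start layout pool, that cold-start templates are taken in stored order, and that hot-start templates are taken in descending order of weight; none of these choices is stated in the embodiments, and the function names are hypothetical.

```python
def split_recommendations(selection_count, cold_ratio):
    """Return (first, second) quantities; first comes from the cold-start pool."""
    first = round(selection_count * cold_ratio)
    second = selection_count - first  # the two quantities sum to selection_count
    return first, second


def pick_templates(cold_pool, hot_pool, selection_count, cold_ratio):
    """Pick templates from both pools according to the split quantities."""
    first, second = split_recommendations(selection_count, cold_ratio)
    # Assumption: hot-start templates are taken in descending weight order.
    hot_sorted = sorted(hot_pool, key=lambda t: t["weight"], reverse=True)
    return cold_pool[:first] + hot_sorted[:second]


picked = pick_templates(
    [{"id": "c1", "weight": 0.1}, {"id": "c2", "weight": 0.0}],
    [{"id": "h1", "weight": 0.5}, {"id": "h2", "weight": 0.9},
     {"id": "h3", "weight": 0.7}, {"id": "h4", "weight": 0.6}],
    selection_count=4,
    cold_ratio=0.25,
)
```

Reserving a share of the slots for cold-start templates gives newly extracted templates a chance to be exposed and to accumulate selection statistics.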

Wherein, the apparatus 1700 further comprises:

the first time updating module 18 is configured to update the target historical recommendation times and the target historical selection times of the target layout page based on the page selection operation for the target layout page;

and the weight updating module 19 is configured to update the recommended weight value of the target layout template corresponding to the target layout page according to the target historical recommendation times and the target historical selection times.

Wherein, the apparatus 1700 further comprises:

a first weight obtaining module 20, configured to obtain at least two cold-start layout templates included in the cold-start layout pool and a first recommended weight value corresponding to each cold-start layout template;

the storage updating module 21 is configured to add a cold-start layout template with a first recommended weight value greater than or equal to a weight threshold to the hot-start layout pool;

the second weight obtaining module 22 is configured to obtain at least two hot start layout templates included in the hot start layout pool and a second recommended weight value corresponding to each hot start layout template;

and the template removing module 23 is configured to remove the hot-start layout template with the second recommended weight value smaller than the weight threshold from the hot-start layout pool.
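For example, the movement of templates between the two pools may be sketched as follows. The embodiments state that qualifying cold-start templates are added to the hot-start layout pool and that low-weight hot-start templates are removed from it; whether a promoted template is also removed from the cold-start layout pool, and where a removed hot-start template goes, are not stated, so this sketch assumes that promoted templates are moved and removed ones are simply dropped. The function name `rebalance_pools` is hypothetical.

```python
def rebalance_pools(cold_pool, hot_pool, weight_threshold):
    """Return (new cold pool, new hot pool) after promotion and removal."""
    # Cold-start templates at or above the threshold are promoted.
    promoted = [t for t in cold_pool if t["weight"] >= weight_threshold]
    remaining_cold = [t for t in cold_pool if t["weight"] < weight_threshold]
    # Hot-start templates below the threshold are removed.
    kept_hot = [t for t in hot_pool if t["weight"] >= weight_threshold]
    return remaining_cold, kept_hot + promoted


new_cold, new_hot = rebalance_pools(
    [{"id": "c1", "weight": 0.6}, {"id": "c2", "weight": 0.1}],
    [{"id": "h1", "weight": 0.9}, {"id": "h2", "weight": 0.2}],
    weight_threshold=0.5,
)
```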

Wherein, the apparatus 1700 further comprises:

the second time updating module 24 is configured to update the first historical recommendation times of the first layout template if the time for which the layout page corresponding to the first layout template is in the page display state is greater than or equal to the recommended exposure duration threshold;

a utilization rate obtaining module 25, configured to obtain a layout utilization rate of the first layout template according to the updated first historical recommendation times and the first historical selection times of the first layout template;

and the template deleting module 26 is configured to delete the first layout template from the cold-start layout pool if the layout utilization rate is smaller than the layout retention threshold.
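For example, the exposure counting and deletion may be sketched as follows. The embodiments state only that the layout utilization rate is obtained according to the two counts; defining it as selection times divided by recommendation times is an assumption, as is keeping templates that have not yet been exposed. The function names are hypothetical.

```python
def record_exposure(template, displayed_seconds, exposure_threshold):
    """Count a recommendation only if the page was displayed long enough."""
    if displayed_seconds >= exposure_threshold:
        template["recommended"] += 1
    return template


def utilization(template):
    """Assumed definition: selection times / recommendation times."""
    if template["recommended"] == 0:
        return 0.0
    return template["selected"] / template["recommended"]


def prune_cold_pool(cold_pool, retention_threshold):
    """Drop exposed templates whose utilization is below the retention threshold."""
    return [
        t for t in cold_pool
        if t["recommended"] == 0 or utilization(t) >= retention_threshold
    ]


template = record_exposure({"id": "c1", "recommended": 9, "selected": 1}, 3.0, 2.0)
```

Counting only sufficiently long exposures avoids penalizing a template that was scrolled past before the user could reasonably have considered it.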

Wherein, the layout display module 12 includes:

a layout obtaining unit 12e, configured to obtain N layout templates in response to a picture selection operation for a target picture in the M candidate pictures;

the picture adding unit 12f is configured to acquire picture display areas included in the N layout templates, add the target picture to the picture display areas in the N layout templates, and generate N layout pages;

and the page display unit 12g is used for displaying the N layout pages in the data recommendation area.

Wherein, the layout display module 12 includes:

a second number acquiring unit 12h configured to acquire, in response to a picture selection operation for a target picture among the M candidate pictures, a number of display texts of text data displayed in a first display page of the media file and a number of display pictures of the file picture;

a target layout obtaining unit 12i, configured to obtain a target layout pool corresponding to the number of display texts and the number of display pictures, and obtain N layout templates from the target layout pool;

the page generating unit 12j is configured to write the target picture, the text data, and the file picture into the N layout templates, and generate N layout pages;

the page display unit 12g is further configured to display N layout pages in the data recommendation area.
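For example, writing the target picture, the text data, and the file picture into the N layout templates may be sketched as follows. The sketch assumes that each layout template lists named picture slots and text slots that are filled in order, with the target picture first; this template representation and the function name `build_layout_pages` are hypothetical.

```python
def build_layout_pages(templates, target_picture, file_pictures, texts):
    """Fill each template's slots to produce one layout page per template."""
    pictures = [target_picture] + list(file_pictures)
    pages = []
    for template in templates:
        pages.append({
            "template_id": template["id"],
            # Slots are filled in order; zip stops at the shorter sequence.
            "pictures": dict(zip(template["picture_slots"], pictures)),
            "texts": dict(zip(template["text_slots"], texts)),
        })
    return pages


pages = build_layout_pages(
    [
        {"id": "t1", "picture_slots": ["left", "right"], "text_slots": ["title"]},
        {"id": "t2", "picture_slots": ["top", "bottom"], "text_slots": ["caption"]},
    ],
    target_picture="target.png",
    file_pictures=["file.png"],
    texts=["Spring outing"],
)
```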

The embodiment of the application provides a file processing apparatus, which can run in user equipment. In a media application program, the user equipment can display text data in a first display page of a media file, and display M candidate pictures in a data recommendation area for the text data, where M is a positive integer; in response to a picture selection operation for a target picture among the M candidate pictures, display N layout pages in the data recommendation area, where each layout page includes the target picture and the text data displayed in the first display page of the media file, and N is a positive integer; and in response to a page selection operation for a target layout page among the N layout pages, switch the display of the first display page to the target layout page. Through this process, related pictures can be recommended for the media file according to its content for the user to select, so the user does not need to search for pictures, which reduces the cost of retrieving pictures while making the media file. When the user selects the target picture to be used, layout pages that include the target picture are provided based on that picture, and each layout page is already typeset; that is, the user can directly select the required target layout page from the layout pages provided by the user equipment. The page is thus typeset directly, which reduces the typesetting cost of the media file, improves its generation efficiency, and improves its generation flexibility and display effect.

Referring to fig. 18, fig. 18 is a schematic structural diagram of a computer device according to an embodiment of the present application. As shown in fig. 18, the computer device in the embodiment of the present application may include: one or more processors 1801, memory 1802, and input-output interface 1803. The processor 1801, memory 1802, and input/output interface 1803 are connected by a bus 1804. The memory 1802 is configured to store a computer program comprising program instructions, and the input/output interface 1803 is configured to receive data and output data, such as for data interaction between a user equipment and a server; the processor 1801 is configured to execute program instructions stored by the memory 1802.

The processor 1801 may perform the following operations:

In some possible embodiments, the processor 1801 may be a Central Processing Unit (CPU), and the processor may also be other general purpose processors, Digital Signal Processors (DSPs), Application Specific Integrated Circuits (ASICs), field-programmable gate arrays (FPGAs) or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, and so on. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.

The memory 1802, which may include both read-only memory and random-access memory, provides instructions and data to the processor 1801 and the input/output interface 1803. A portion of memory 1802 may also include non-volatile random access memory. For example, memory 1802 may also store device type information.

In a specific implementation, the computer device may execute the implementation manners provided in the steps in fig. 3 through each built-in functional module thereof, which may specifically refer to the implementation manners provided in the steps in fig. 3, and details are not described herein again.

The embodiment of the present application provides a computer device, including a processor, an input/output interface, and a memory, wherein the processor acquires the computer program in the memory, executes each step of the method shown in fig. 3, and performs the file processing operation. According to the embodiment of the application, related pictures are recommended for the media file according to its content for the user to select, so the user does not need to search for pictures, which reduces the cost of retrieving pictures while making the media file. When the user selects the target picture to be used, layout pages that include the target picture are provided based on that picture, and each layout page is already typeset; that is, the user can directly select the required target layout page from the layout pages provided by the user equipment. The page is thus typeset directly, which reduces the typesetting cost of the media file, improves its generation efficiency, and improves its generation flexibility and display effect.

An embodiment of the present application further provides a computer-readable storage medium, where a computer program is stored, where the computer program is suitable for being loaded by the processor and executing the file processing method provided in each step in fig. 3, and for details, reference may be made to implementation manners provided in each step in fig. 3, and details are not described here again. In addition, the beneficial effects of the same method are not described in detail. For technical details not disclosed in embodiments of the computer-readable storage medium referred to in the present application, reference is made to the description of embodiments of the method of the present application. By way of example, a computer program can be deployed to be executed on one computer device or on multiple computer devices at one site or distributed across multiple sites and interconnected by a communication network.

The computer-readable storage medium may be an internal storage unit of the file processing apparatus provided in any of the foregoing embodiments or of the computer device, such as a hard disk or a memory of the computer device. The computer-readable storage medium may also be an external storage device of the computer device, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, a flash card, and the like, provided on the computer device. Further, the computer-readable storage medium may also include both an internal storage unit and an external storage device of the computer device. The computer-readable storage medium is used for storing the computer program and other programs and data required by the computer device. The computer-readable storage medium may also be used to temporarily store data that has been output or is to be output.

Embodiments of the present application also provide a computer program product or computer program comprising computer instructions stored in a computer-readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium and executes them, so that the computer device executes the method provided in the various optional modes in fig. 3. In this way, related pictures are recommended for the media file according to its content for the user to select, so the user does not need to search for pictures, which reduces the cost of retrieving pictures while making the media file. When the user selects the target picture to be used, layout pages that include the target picture are provided based on that picture, and each layout page is already typeset; that is, the user can directly select the required target layout page from the layout pages provided by the user equipment. The page is thus typeset directly, which reduces the typesetting cost of the media file, improves its generation efficiency, and improves its generation flexibility and display effect.

The terms "first," "second," and the like in the description, claims, and drawings of the embodiments of the present application are used to distinguish between different objects and not to describe a particular order. Furthermore, the term "comprises" and any variations thereof are intended to cover non-exclusive inclusion. For example, a process, method, apparatus, product, or device that comprises a list of steps or modules is not limited to the listed steps or modules, but may optionally include other steps or modules that are not listed or that are inherent to such process, method, apparatus, product, or device.

Those of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein may be implemented in electronic hardware, computer software, or a combination of both. To clearly illustrate the interchangeability of hardware and software, the components and steps of the examples have been described in this specification generally in terms of their functions. Whether such functions are implemented as hardware or software depends on the particular application and the design constraints imposed on the implementation. Skilled artisans may implement the described functions in different ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.

The method and the related apparatus provided by the embodiments of the present application are described with reference to the flowchart and/or the structural diagram of the method provided by the embodiments of the present application, and each flow and/or block of the flowchart and/or the structural diagram of the method, and the combination of the flow and/or block in the flowchart and/or the block diagram can be specifically implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable document processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable document processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block or blocks of the block diagram. These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable document processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block or blocks of the block diagram. These computer program instructions may also be loaded onto a computer or other programmable document processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

The steps in the methods of the embodiments of the application may be reordered, combined, and deleted according to actual needs.

The modules in the device can be merged, divided and deleted according to actual needs.

The above disclosure describes only preferred embodiments of the present application and is not to be construed as limiting the scope of the present application; equivalent variations and modifications made in accordance with the claims of the present application therefore still fall within the scope of the present application.

Claims

1. A method of file processing, the method comprising:

displaying text data in a first display page of a media file, and displaying M candidate pictures in a data recommendation area for the text data; M is a positive integer;

in response to a picture selection operation for a target picture among the M candidate pictures, displaying N layout pages in the data recommendation area; each layout page includes the target picture and the text data displayed in the first display page of the media file; N is a positive integer;

2. The method of claim 1, wherein the displaying text data in a first display page of a media file, and displaying M candidate pictures in a data recommendation area for the text data, comprises:

responding to an input operation aiming at target text data of a media file, and displaying the target text data in the media file;

and in response to the picture recommendation operation aiming at the target text data, displaying M candidate pictures associated with the target text data in a data recommendation area.

3. The method of claim 1, wherein the displaying text data in a first display page of a media file and displaying, for the text data, M candidate pictures in a data recommendation area comprises:

displaying target text data in the first display page of the media file, and displaying M candidate pictures associated with the target text data in the data recommendation area; or

displaying target text data in the first display page of the media file, and, when a trigger operation for a picture recommendation component is received, displaying M candidate pictures associated with the target text data in the data recommendation area; or

displaying target text data in the first display page of the media file, and displaying a picture recommendation component in response to a trigger operation for the target text data; and, in response to a trigger operation for the picture recommendation component, displaying M candidate pictures associated with the target text data in the data recommendation area.

4. The method of claim 1, wherein the method further comprises:

in response to a viewing operation for the target picture among the M candidate pictures, displaying a historical utilization rate of the target picture; the historical utilization rate represents the probability that the target picture is selected.

5. The method of claim 1, wherein the layout page further comprises a text box to be edited; the method further comprises the following steps:

displaying a content prompt text in a target text box to be edited of the target layout page, and, in response to an input operation for the target text box to be edited, replacing the content prompt text displayed in the target text box to be edited with the text content corresponding to the input operation.

6. The method of claim 1, wherein the media file comprises at least two display pages, the at least two display pages comprising the first display page; the method further comprises the following steps:

applying an illustration mark to a second display page, among the at least two display pages, that has the same semantic information as the first display page; the illustration mark indicates that the M candidate pictures are recommended for the second display page.

7. The method of claim 1, wherein displaying M candidate pictures in a data recommendation region in response to a picture recommendation request for a media file comprises:

in response to a picture recommendation request for the media file, acquiring, based on the picture recommendation request, the text data displayed in the first display page of the media file;

extracting keywords from the text data to obtain text keywords corresponding to the text data;

and acquiring M candidate pictures matched with the text keywords from a candidate recommendation gallery, and displaying the M candidate pictures in a data recommendation area.

8. The method of claim 7, wherein the extracting keywords from the text data to obtain text keywords corresponding to the text data comprises:

performing word segmentation on the text data to obtain f₁ word-segmentation phrases; f₁ is a positive integer;

obtaining the phrase frequency corresponding to each of the f₁ word-segmentation phrases, obtaining the inverse document frequency corresponding to each of the f₁ word-segmentation phrases, and determining the phrase importance corresponding to each of the f₁ word-segmentation phrases according to its phrase frequency and inverse document frequency;

determining, based on the phrase importance, the text keywords corresponding to the text data from the f₁ word-segmentation phrases.
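
As a non-limiting illustration of the keyword extraction recited in claim 8, the following sketch ranks word-segmentation phrases by importance. It assumes that importance is the product of phrase frequency and inverse document frequency (the usual TF-IDF weighting, which the claim does not mandate); the function name `top_keywords`, the parameter `k`, and the precomputed `idf` table are hypothetical.

```python
from collections import Counter


def top_keywords(phrases, idf, k=3):
    """Return the k phrases with the highest assumed importance (tf * idf)."""
    counts = Counter(phrases)
    total = sum(counts.values())
    # Assumption of this sketch: importance = phrase frequency * inverse
    # document frequency; phrases missing from the idf table score 0.0.
    importance = {p: (c / total) * idf.get(p, 0.0) for p, c in counts.items()}
    # Sort by descending importance, breaking ties alphabetically.
    ranked = sorted(importance, key=lambda p: (-importance[p], p))
    return ranked[:k]
```

A frequent but uninformative phrase therefore ranks below a rarer phrase whose inverse document frequency is high.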

9. The method of claim 8, wherein the obtaining the phrase frequency corresponding to each of the f₁ word-segmentation phrases comprises:

dividing identical word-segmentation phrases among the f₁ word-segmentation phrases into f₂ phrase sets; the word-segmentation phrases included in each phrase set are the same; f₂ is a positive integer;

counting the number of word-segmentation phrases in the i-th phrase set, and determining the ratio of the number of phrases in the i-th phrase set to the sum of the numbers of word-segmentation phrases respectively included in the f₂ phrase sets as the phrase frequency corresponding to the word-segmentation phrases included in the i-th phrase set.
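
The phrase frequency of claim 9 can be sketched as follows, purely as an illustration. Grouping identical phrases into sets is modelled with `collections.Counter`; the function name `phrase_frequencies` is hypothetical.

```python
from collections import Counter


def phrase_frequencies(phrases):
    """Map each distinct phrase to its share of all segmented phrases."""
    counts = Counter(phrases)      # one entry per phrase set
    total = sum(counts.values())   # sum of the sizes of all phrase sets
    return {p: c / total for p, c in counts.items()}
```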

10. The method of claim 8, wherein the obtaining the inverse document frequency corresponding to each of the f₁ word-segmentation phrases comprises:

acquiring at least two pieces of sample text data included in a corpus, and performing word segmentation on the at least two pieces of sample text data respectively to obtain sample word-segmentation phrases respectively corresponding to the at least two pieces of sample text data;

determining the number of pieces of sample text data associated with a sample word-segmentation phrase as the number of associated texts of the sample word-segmentation phrase;

and obtaining the total number of sample texts of the at least two pieces of sample text data, and determining the inverse document frequency of the sample word-segmentation phrase according to the total number of sample texts and the number of associated texts of the sample word-segmentation phrase.
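
A minimal sketch of the inverse document frequency of claim 10 follows. The claim only states that the value is determined from the total number of sample texts and the number of associated texts; the unsmoothed formula log(total / associated) used here is an assumption, as is the function name `inverse_document_frequencies`. The input is assumed to be already segmented.

```python
import math


def inverse_document_frequencies(segmented_samples):
    """Compute log(total texts / texts containing the phrase) per phrase."""
    total = len(segmented_samples)
    associated = {}
    for phrases in segmented_samples:
        # Count each phrase at most once per sample text.
        for phrase in set(phrases):
            associated[phrase] = associated.get(phrase, 0) + 1
    return {p: math.log(total / n) for p, n in associated.items()}
```

Under this assumed formula, a phrase that appears in every sample text receives an inverse document frequency of zero.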

11. The method of claim 8, wherein the method further comprises:

determining a word-segmentation phrase whose inverse document frequency is less than a phrase validity threshold as an invalid word-segmentation phrase, and marking the word-segmentation phrases other than the invalid word-segmentation phrases among the f₁ word-segmentation phrases as valid word-segmentation phrases;

wherein the determining, based on the phrase importance, the text keywords corresponding to the text data from the f₁ word-segmentation phrases comprises:

determining, based on the phrase importance, the text keywords corresponding to the text data from the valid word-segmentation phrases.

12. The method of claim 8, wherein the performing word segmentation on the text data to obtain f₁ word-segmentation phrases comprises:

splitting the text data to obtain at least two characters forming the text data, and forming a directed word graph from the at least two characters; the at least two characters are nodes of the directed word graph;

obtaining at least two character paths according to the directed word graph and the association degree between adjacent characters in the directed word graph;

acquiring path lengths respectively corresponding to the at least two character paths, and determining a shortest character path from the at least two character paths according to the path lengths;

composing the characters corresponding to the shortest character path into word-segmentation phrases to obtain the f₁ word-segmentation phrases.
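
The shortest-path segmentation of claim 12 can be illustrated with the sketch below. It makes two simplifying assumptions that are not in the claim: the association degree between adjacent characters is modelled by membership in a hypothetical `vocabulary` of known phrases, and the path length is the number of phrases on the path. The function name `segment_shortest_path` is likewise hypothetical.

```python
def segment_shortest_path(text, vocabulary):
    """Split text into the fewest phrases using dynamic programming."""
    n = len(text)
    # best[i] = (phrase count, previous cut position) for the prefix text[:i]
    best = [(0, -1)] + [(float("inf"), -1)] * n
    for end in range(1, n + 1):
        for start in range(end):
            piece = text[start:end]
            # Single characters are always allowed so that a path exists.
            if end - start == 1 or piece in vocabulary:
                cost = best[start][0] + 1
                if cost < best[end][0]:
                    best[end] = (cost, start)
    # Walk the back-pointers from the end of the text to recover the path.
    phrases, pos = [], n
    while pos > 0:
        start = best[pos][1]
        phrases.append(text[start:pos])
        pos = start
    return phrases[::-1]
```

Each cut position acts as a node of the directed graph and each admissible phrase as an edge, so the fewest-phrase segmentation corresponds to the shortest path from the start of the text to its end.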

13. The method of claim 7, wherein the obtaining M candidate pictures matching the text keyword from the candidate recommendation gallery comprises:

performing vector conversion on the text keywords to obtain keyword vectors;

acquiring, from the candidate recommendation gallery, at least two associated pictures whose picture labels are associated with the text keywords, and performing picture coding on the at least two associated pictures respectively to obtain picture vectors respectively corresponding to the at least two associated pictures;

determining the semantic similarity between each of the at least two associated pictures and the text keywords according to the vector distance between the picture vector corresponding to that associated picture and the keyword vector;

and acquiring M candidate pictures from the at least two associated pictures based on the semantic similarity.
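
By way of illustration only, the selection in claim 13 can be sketched as a ranking by vector similarity. Cosine similarity is assumed here as the measure derived from the vector distance; the claim does not name a specific distance. The names `cosine_similarity` and `top_m_pictures`, and the dictionary mapping picture identifiers to vectors, are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def top_m_pictures(keyword_vector, picture_vectors, m):
    """Return the ids of the m pictures most similar to the keyword vector."""
    ranked = sorted(
        picture_vectors.items(),
        key=lambda item: cosine_similarity(keyword_vector, item[1]),
        reverse=True,
    )
    return [picture_id for picture_id, _ in ranked[:m]]
```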

14. The method of claim 13, wherein said vector converting said text keywords to obtain a keyword vector comprises:

mapping the text keywords to a target semantic space to obtain a first vector;

performing dimensionality reduction on the first vector to obtain a keyword vector with a target vector length;

wherein the performing picture coding on the at least two associated pictures respectively to obtain picture vectors respectively corresponding to the at least two associated pictures comprises:

mapping the at least two associated pictures to the target semantic space respectively to obtain second vectors corresponding to the at least two associated pictures respectively;

and performing dimensionality reduction on the second vectors to obtain picture vectors having the target vector length.

15. The method of claim 1, wherein the displaying N layout pages in the data recommendation region in response to a picture selection operation for a target picture of the M candidate pictures comprises:

in response to a picture selection operation for a target picture among the M candidate pictures, acquiring a layout selection quantity and a layout recommendation proportion, and determining a first layout recommended quantity and a second layout recommended quantity according to the layout selection quantity and the layout recommendation proportion; the sum of the first layout recommended quantity and the second layout recommended quantity is the layout selection quantity;

acquiring a first layout template from a cold-start layout pool based on the first layout recommended quantity, and acquiring a second layout template from a hot-start layout pool based on the second layout recommended quantity; the recommended weight value of the second layout template is greater than that of the first layout template;

combining the first layout template and the second layout template with the target picture respectively to generate N layout pages; N is the layout selection quantity;

and displaying the N layout pages in the data recommendation area.
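
The split between the cold-start and hot-start layout pools in claim 15 can be sketched as follows. The sketch assumes that the layout recommendation proportion is the share of the N slots filled from the cold-start pool and that templates with the highest recommended weight are taken first; neither detail is stated in the claim. The names `split_recommended_quantities`, `pick_templates`, and `cold_ratio` are hypothetical.

```python
def split_recommended_quantities(selection_quantity, cold_ratio):
    """Split N into (first, second) quantities whose sum is N."""
    first = round(selection_quantity * cold_ratio)
    first = max(0, min(selection_quantity, first))
    return first, selection_quantity - first


def _by_weight(pool):
    """Template names ordered by descending recommended weight."""
    return sorted(pool, key=pool.get, reverse=True)


def pick_templates(cold_pool, hot_pool, selection_quantity, cold_ratio):
    """Take templates from both pools; pools map template name to weight."""
    first, second = split_recommended_quantities(selection_quantity, cold_ratio)
    return _by_weight(cold_pool)[:first] + _by_weight(hot_pool)[:second]
```

Reserving a small share for the cold-start pool gives new templates a chance to be shown while most slots go to templates with an established weight.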

16. The method of claim 15, wherein the method further comprises:

acquiring at least two cold-start layout templates included in the cold-start layout pool and a first recommended weight value corresponding to each cold-start layout template;

adding a cold-start layout template whose first recommended weight value is greater than or equal to a weight threshold to the hot-start layout pool;

acquiring at least two hot-start layout templates included in the hot-start layout pool and a second recommended weight value corresponding to each hot-start layout template;

removing, from the hot-start layout pool, a hot-start layout template whose second recommended weight value is less than the weight threshold.
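
The pool maintenance of claim 16 can be illustrated by the sketch below, in which each pool is a dictionary from template name to recommended weight value. The sketch assumes that a promoted template is also taken out of the cold-start pool and that a template removed from the hot-start pool is simply dropped; the claim is silent on both points. The function name `rebalance_pools` is hypothetical.

```python
def rebalance_pools(cold_pool, hot_pool, weight_threshold):
    """Return updated (cold_pool, hot_pool) after promotion and removal."""
    # Cold-start templates at or above the threshold move to the hot pool.
    promoted = {t: w for t, w in cold_pool.items() if w >= weight_threshold}
    # Hot-start templates below the threshold are removed.
    new_hot = {t: w for t, w in hot_pool.items() if w >= weight_threshold}
    new_hot.update(promoted)
    # Assumption: promoted templates no longer stay in the cold pool.
    new_cold = {t: w for t, w in cold_pool.items() if t not in promoted}
    return new_cold, new_hot
```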

17. The method of claim 15, wherein the method further comprises:

if the duration for which the layout page corresponding to the first layout template is in a page display state is greater than or equal to a recommended exposure duration threshold, updating the first historical recommendation times of the first layout template;

acquiring the layout utilization rate of the first layout template according to the updated first historical recommendation times and the first historical selection times of the first layout template;

and if the layout utilization rate is less than a layout retention threshold, deleting the first layout template from the cold-start layout pool.
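
A minimal sketch of the exposure counting and pruning in claim 17 follows. It assumes that the utilization rate is the historical selection times divided by the historical recommendation times, which the claim does not spell out; the class `LayoutStats` and the function `update_after_exposure` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class LayoutStats:
    recommended: int = 0  # historical recommendation times
    selected: int = 0     # historical selection times


def update_after_exposure(cold_pool, name, shown_seconds,
                          exposure_threshold, retention_threshold):
    """Count a sufficiently long exposure and prune a rarely chosen template."""
    stats = cold_pool[name]
    if shown_seconds < exposure_threshold:
        return  # too short to count as a recommendation
    stats.recommended += 1
    # Assumed definition of the utilization rate: selections / recommendations.
    rate = stats.selected / stats.recommended
    if rate < retention_threshold:
        del cold_pool[name]
```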

18. The method of claim 1, wherein the displaying N layout pages in the data recommendation region in response to a picture selection operation for a target picture of the M candidate pictures comprises:

in response to a picture selection operation for a target picture among the M candidate pictures, acquiring N layout templates;

acquiring picture display areas respectively included by the N layout templates, adding the target picture to the picture display areas in the N layout templates, and generating N layout pages;

and displaying the N layout pages in the data recommendation area.

19. The method of claim 1, wherein the displaying N layout pages in the data recommendation region in response to a picture selection operation for a target picture of the M candidate pictures comprises:

in response to a picture selection operation for a target picture among the M candidate pictures, acquiring the display text quantity of the text data displayed in the first display page of the media file and the display picture quantity of file pictures displayed in the first display page;

acquiring a target layout pool corresponding to the display text quantity and the display picture quantity, and acquiring N layout templates from the target layout pool;

writing the target picture, the text data and the file picture into the N layout templates to generate N layout pages;

and displaying the N layout pages in the data recommendation area.
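
The pool lookup of claim 19 can be sketched as follows, under the assumption that layout pools are indexed by the pair (display text quantity, display picture quantity) and that a layout page can be represented as a plain dictionary. The names `build_layout_pages` and `layout_pools` are hypothetical.

```python
def build_layout_pages(layout_pools, texts, file_pictures, target_picture, n):
    """Fill up to n templates from the pool matching the page's content counts."""
    # Assumed index: (number of text items, number of existing file pictures).
    key = (len(texts), len(file_pictures))
    templates = layout_pools.get(key, [])[:n]
    pictures = list(file_pictures) + [target_picture]
    return [
        {"template": t, "texts": list(texts), "pictures": list(pictures)}
        for t in templates
    ]
```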

20. A file processing apparatus, wherein the apparatus comprises:

a candidate picture display module, configured to display text data in a first display page of a media file, and to display, for the text data, M candidate pictures in a data recommendation area; M is a positive integer;

a layout display module, configured to display N layout pages in the data recommendation area in response to a picture selection operation for a target picture among the M candidate pictures; each layout page includes the target picture and the text data displayed in the first display page of the media file; N is a positive integer;

and a layout selection module, configured to switch the first display page to display a target layout page in response to a page selection operation for the target layout page among the N layout pages.