WO2022201237A1 - Server, text field arrangement position method, and program - Google Patents


Info

Publication number
WO2022201237A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
saliency
placement position
text field
cut
Prior art date
Application number
PCT/JP2021/011672
Other languages
French (fr)
Japanese (ja)
Inventor
孝弘 坪野
イー カー ヤン
美帆 折坂
Original Assignee
株式会社オープンエイト
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 株式会社オープンエイト filed Critical 株式会社オープンエイト
Priority to PCT/JP2021/011672 priority Critical patent/WO2022201237A1/en
Publication of WO2022201237A1 publication Critical patent/WO2022201237A1/en


Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/235 Processing of additional data, e.g. scrambling of additional data or processing content descriptors
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00 Details of television systems
    • H04N5/76 Television signal recording
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00 Details of television systems
    • H04N5/76 Television signal recording
    • H04N5/91 Television signal processing therefor

Definitions

  • the present invention relates to a server for generating video content to be distributed to user terminals, a text field layout position method, and a program.
  • Patent Document 1 proposes a moving image processing apparatus that efficiently searches for a desired scene image from a moving image having a plurality of chapters.
  • in general, a moving image includes objects or areas with saliency that draw the viewer's attention (hereinafter referred to as "saliency regions"), and if a text sentence is superimposed on at least part of a saliency region, the visibility of the moving image is impaired. Furthermore, if the text sentence is not placed in a direction and at a distance that allow a natural movement of the line of sight from the saliency region of the object to the text, the visibility of the moving image is impaired and the readability of the text is also impaired.
  • the present invention aims to provide a server or the like that makes it possible to easily create composite content data and, in particular, to place text sentences in consideration of their positional relationship with saliency regions in a moving image.
  • according to one aspect of the present invention, there is provided a server comprising a material content data setting unit that sets image data for a cut, and a text field placement position determination unit that determines the position of a text field to be placed on the cut by referring to a saliency region in the image data.
  • according to the present invention, it becomes possible to provide a server or the like that makes it possible to easily create composite content data and, in particular, to place text sentences in consideration of their positional relationship with saliency regions in a moving image.
  • FIG. 1 is a configuration diagram of a system according to an embodiment
  • FIG. 2 is a configuration diagram of a server according to an embodiment
  • FIG. 3 is a configuration diagram of a management terminal and a user terminal according to an embodiment
  • FIG. 4 is a functional block diagram of a system according to an embodiment
  • FIG. 5 is a diagram explaining an example screen layout that constitutes a cut
  • FIG. 6 is a flow chart of a system according to an embodiment
  • FIG. 7 is an explanatory diagram of an aspect in which a plurality of cuts forming composite content data are displayed as a list on a screen
  • FIG. 8 is a diagram explaining a second data field placement position determination method according to an embodiment
  • FIG. 9 is a diagram explaining saliency object detection according to an embodiment
  • FIG. 10 is a diagram showing an example original image for saliency-based detection
  • FIG. 11 is a diagram showing an example of saliency object detection for the image of FIG. 10
  • FIG. 12 is a diagram explaining saliency map detection according to an embodiment
  • FIG. 13 is a diagram showing an example of saliency map detection for the image of FIG. 12
  • FIG. 14 is a diagram showing an example in which a second data field placement recommendation area appears to the upper right of the large mountains that are the saliency objects in the image shown in FIG. 9
  • FIG. 15 is a diagram showing a state in which the second data field is placed in the high-scoring portion, indicated by high density, of the second data field placement recommendation area in the example shown in FIG. 14
  • FIG. 16 is a diagram showing an example in which a second data field placement recommendation area appears to the right of the animal that is the saliency object in the image shown in FIG. 11
  • FIG. 17 is a diagram showing a state in which the second data field is placed in the high-scoring portion, indicated by low density, of the second data field placement recommendation area in the example shown in FIG. 16
  • FIG. 18 is a diagram showing an example of hybrid saliency map detection for the image of FIG. 10
  • FIG. 19 is a diagram showing a modification in which the placement position specifying unit specifies the placement position of the second data field on the cut into which the material content data (image) is inserted
  • FIG. 20 is a diagram showing various placement examples for placing the second data field at the cell position identified by the method shown in FIG. 19
  • FIG. 21 is a diagram showing another example in which the placement position specifying unit specifies the placement position of the second data field on the cut into which the material content data (image) is inserted
  • a server or the like according to an embodiment of the present invention has the following configuration.
  • [Item 1] A server comprising: a material content data setting unit for setting image data for a cut; and a text field placement position determination unit that determines the position of a text field to be placed on the cut by referring to a saliency region in the image data.
  • [Item 2] The server according to item 1, wherein the text field placement position determination unit includes: a saliency region determination unit that determines a saliency region included in the image data; and a placement position specifying unit that specifies the placement position of the text field on the cut with reference to the saliency region.
  • [Item 3] The server according to item 2, wherein the placement position specifying unit is configured to specify the placement position of the text field on the cut based on a model generated by learning, as training data, image data in which the relationship between the saliency region in the image and the text field satisfies a predetermined condition.
  • [Item 4] The server according to item 3, wherein the placement position specifying unit is configured to specify the placement position of the text field on the cut based on a scoring value calculated for each pixel of the image data.
  • [Item 5] The server according to item 3, wherein the placement position specifying unit is configured to specify the placement position of the text field on the cut based on a scoring value calculated for each of a plurality of cells into which the image data is divided.
  • [Item 6] The server according to item 2, wherein the placement position specifying unit is configured to execute: dividing the entire image of the cut in which the image data is set into a plurality of cells; excluding, from among the plurality of cells, cells that include at least part of the saliency region; and identifying, from among the plurality of cells remaining after the exclusion, a cell that satisfies a predetermined condition regarding its relationship with the saliency region as the placement position of the text field.
  • [Item 7] The server according to any one of items 1 to 6, wherein the saliency region is detected by hybrid saliency map detection using saliency object detection and saliency map detection.
  • [Item 8] The server according to any one of items 1 to 6, wherein the saliency region is detected by saliency map detection.
  • [Item 9] The server according to any one of items 1 to 6, wherein the saliency region is detected by saliency object detection.
  • [Item 10] A system comprising the server according to any one of items 1 to 9.
  • [Item 11] A computer-implemented text field placement position method comprising: setting data for a cut; and determining the position of a text field to be placed on the cut based on a saliency region in the image data.
  • [Item 12] A program that causes a computer to execute a text field placement position method comprising: setting image data for a cut; and determining the position of a text field to be placed on the cut based on a saliency region in the image data.
  • a system for creating composite content data (hereinafter referred to as "this system") and the like according to an embodiment of the present invention will now be described.
  • in the description of each embodiment, the same or similar elements are denoted by the same or similar reference numerals and names, and duplicate descriptions of the same or similar elements may be omitted.
  • the features shown in each embodiment can be applied to other embodiments as long as they are not mutually contradictory.
  • the system according to the embodiment includes a server 1, an administrator terminal 2, and a user terminal 3, as shown in FIG. 1.
  • the server 1, the administrator terminal 2, and the user terminal 3 are connected via a network N so as to be able to communicate with each other.
  • Network N may be a local network, or may be connectable to an external network.
  • in this embodiment, a configuration in which the server 1 is composed of one unit is described, but it is also possible to realize the server 1 using a plurality of server devices.
  • the server 1 and the administrator terminal 2 may also be combined into a single device.
  • FIG. 2 is a diagram showing the hardware configuration of the server 1 shown in FIG. 1. Note that the illustrated configuration is an example, and other configurations may be employed. Also, the server 1 may be a general-purpose computer such as a workstation or a personal computer, or may be logically realized by cloud computing.
  • the server 1 includes at least a processor 10 , a memory 11 , a storage 12 , a transmission/reception section 13 , an input/output section 14 and the like, which are electrically connected to each other through a bus 15 .
  • the processor 10 is an arithmetic device that controls the overall operation of the server 1, controls transmission and reception of data between elements, executes applications, and performs information processing necessary for authentication processing.
  • the processor 10 includes a CPU (Central Processing Unit) and a GPU (Graphics Processing Unit), and executes programs for this system stored in the storage 12 and developed in the memory 11 to perform each information process.
  • the processing capability of the processor 10 only needs to be sufficient for executing the necessary information processing; for example, the processor 10 may be composed only of a CPU, but is not limited to this.
  • the memory 11 includes a main memory composed of a volatile memory device such as a DRAM (Dynamic Random Access Memory), and an auxiliary memory composed of a non-volatile memory device such as a flash memory or an HDD (Hard Disc Drive).
  • the memory 11 is used as a work area or the like for the processor 10, and may store a BIOS (Basic Input/Output System) executed when the server 1 is started, various setting information, and the like.
  • the storage 12 stores various programs such as application programs.
  • a database storing data used for each process may be constructed in the storage 12 .
  • the storage 12 stores a computer program for causing the server 1 to execute the composite content data creation method described with reference to FIG. 6.
  • a computer program is stored for causing the server 1 to execute the second data arrangement position determination method described below.
  • the transmission/reception unit 13 connects the server 1 to the network.
  • the input/output unit 14 is an information input device such as a keyboard and mouse, and an output device such as a display.
  • a bus 15 is commonly connected to the above elements and transmits, for example, address signals, data signals and various control signals.
  • the administrator terminal 2 and the user terminal 3 shown in FIG. 3 also include a processor 20, a memory 21, a storage 22, a transmission/reception section 23, an input/output section 24, and the like, which are electrically connected to each other through a bus 25. Since the function of each element can be configured in the same manner as in the server 1 described above, detailed description of each element is omitted.
  • the administrator uses the administrator terminal 2 to, for example, change the settings of the server 1 and manage the operation of the database.
  • a user can access the server 1 from the user terminal 3 to create or view composite content data, for example.
  • FIG. 4 is a block diagram illustrating functions implemented in the server 1.
  • the server 1 includes a communication section 110 , a storage section 160 and a material content data setting section 190 .
  • Material content data setting unit 190 includes identified information analysis unit 120, second data generation unit 130, composite content data generation unit 140, association unit 150, classifier 170, and second data placement position determination unit 180.
  • Composite content data generator 140 includes base data generator 142 , second data allocation unit 144 , and material content data allocation unit 146 .
  • the storage unit 160 is composed of storage areas such as the memory 11 and the storage 12, and includes a base data storage unit 161, a material content data storage unit 163, a composite content data storage unit 165, an interface information storage unit 167, and other various databases.
  • the second data placement position determination unit 180 includes a saliency region determination unit 182 that determines a saliency region related to an object or range having saliency in the material content data, and a placement position specifying unit 184 that specifies the position where the second data field is to be placed.
  • the functions of the units 120, 130, 140, 150, 170, and 180 that make up the material content data setting unit 190 can be implemented by one or more processors 10, for example.
  • the communication unit 110 communicates with the administrator terminal 2 and the user terminal 3.
  • the communication unit 110 also functions as a reception unit that receives first data including information to be identified, for example, from the user terminal 3 .
  • the first data is, for example, text data such as an article containing information to be identified (for example, a press release or news), image data containing information to be identified (for example, a photograph or illustration), video data, or voice data including information to be identified.
  • the text data here is not limited to text data at the time of transmission to the server 1, but may be text data generated by a known voice recognition technique from voice data transmitted to the server 1, for example.
  • the first data may also be text data such as an article summarized by an existing automatic summarization technique, such as extractive summarization or generative summarization (the summary including the information to be identified).
  • the audio data referred to here is not limited to audio data acquired by an input device such as a microphone, but may be audio data extracted from video data or audio data generated from text data.
  • for example, audio data such as narration and lines may be extracted from temporary images such as rough sketches and temporary moving images such as a temporary video, and composite content data may be generated based on the audio data together with material content data, as will be described later.
  • also, voice data may be created from text data with a story; in the case of a fairy tale, for example, a picture-story show or a moving image based on the read-out story and material content data may be generated as composite content data.
  • when the second data generation unit 130 determines that it is not necessary to divide the first data (for example, when the text data is a short sentence of a preset number of characters or less), the second data generation unit 130 generates the first data as it is as the second data.
  • when the second data generation unit 130 determines that the first data needs to be divided, it divides the first data and generates pieces of second data, each including at least part of the information to be identified of the first data.
  • at this time, division number information of the second data is also generated. Any known technique may be used for the method of dividing the first data by the second data generation unit 130; for example, if the first data can be converted into text, sentences may be separated so that a natural section as a sentence fits into each cut, based on the maximum number of characters in each cut of the base data and analysis of the modification relationships between clauses. A minimal sketch of such splitting is shown below.
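Illustrative sketch only: the patent describes splitting based on the maximum number of characters per cut plus analysis of modification relationships between clauses; this toy version splits at sentence boundaries alone, and every name in it is hypothetical.

```python
# Hypothetical sketch of the splitting step: sentence-boundary splitting against
# a per-cut character budget. The clause-dependency analysis mentioned in the
# text is not modeled here.
import re

def split_into_second_data(first_data: str, max_chars_per_cut: int = 60) -> list[str]:
    """Split input text into segments that each fit within one cut."""
    sentences = [s for s in re.split(r"(?<=[.!?。])\s*", first_data) if s]
    segments, current = [], ""
    for sentence in sentences:
        if current and len(current) + len(sentence) + 1 > max_chars_per_cut:
            segments.append(current)
            current = sentence
        else:
            current = f"{current} {sentence}".strip()
    if current:
        segments.append(current)
    return segments  # len(segments) doubles as the division number information

segments = split_into_second_data(
    "The new store opens in Shibuya. It targets teens. Opening day is soon.", 40
)
print(len(segments), segments)  # -> 2 segments for a 40-character budget
```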
  • the identified information analysis unit 120 analyzes the second data described above and acquires identified information.
  • the information to be identified may be any information as long as it can be analyzed by the information to be identified analysis unit 120 .
  • the identified information may be in word form defined by a language model. More specifically, it may be one or more words (for example, "Shibuya, Shinjuku, Roppongi” or "Shibuya, Landmark, Teen”) accompanied by a word vector, which will be described later.
  • the words may include words that are not usually used alone, such as "n", depending on the language model.
  • a feature vector extracted from a document, an image, or a moving image may be used instead of the above-described word format.
  • the composite content data generation unit 140 generates base data including a number of cuts (one or more cuts) corresponding to the division number information generated by the second data generation unit 130 described above, combines the material content data newly input from the user terminal 3 and/or stored in the material content data storage unit 163 with the base data in which the second data is assigned to each cut, generates the result as composite content data, stores it in the composite content data storage unit 165, and displays it on the user terminal 3.
  • the base data generation unit 142 assigns numbers to the generated one or more cuts, such as scene 1, scene 2, scene 3 or cut 1, cut 2, cut 3, for example.
  • Fig. 5 is an example of a screen layout of cuts that make up the base data.
  • into the second data field 31, which is a text data field, the edited second data (for example, delimited text sentences) is inserted, and into the material content data field 32, the selected material content data is inserted. In FIG. 5, the second data field 31 and the material content data field 32 are shown separated, but the second data field 31 may instead be inserted so as to overlay the material content data field 32. In that case, the second data field 31 must be arranged so as not to overlap the saliency region in the material content data.
  • format information may be set for each cut, such as the preset maximum number of characters and screen layout in the case of text data, and the playback time in the case of video.
  • the composite content data does not necessarily need to be stored in the composite content data storage unit 165, and may be stored at an appropriate timing.
  • the base data to which only the second data is assigned may be displayed on the user terminal 3 as progress information of the composite content data.
  • the second data allocation unit 144 sequentially allocates the second data in the order of numbers assigned to one or more cuts generated by the base data generation unit 142 described above.
  • the association unit 150 compares at least part of the information to be identified included in the second data with, for example, extracted information extracted from the material content data (for example, class labels extracted by the classifier), determines mutual similarity or the like, and associates material content data suitable for the second data (for example, data having a high degree of similarity) with the second data.
  • for example, suppose that material content data A (for example, an image of a woman's face) with the extracted information "face" and material content data B (for example, an image of a mountain) with the extracted information "mountain" are prepared, and the information to be identified included in the second data represents "teacher".
  • in that case, the word vector obtained from "teacher" is more similar to the word vector obtained from "face" than to the word vector obtained from "mountain", so the second data is associated with material content data A.
  • the extraction information of the material content data may be extracted in advance by the user and stored in the material content data storage unit 163, or may be extracted by the classifier 170, which will be described later.
  • the similarity determination may be performed by preparing a trained model that has learned word vectors, and using the vectors to determine the similarity of words by a method such as cosine similarity or Word Mover's Distance; a toy sketch of such a check follows.
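To make the similarity check concrete, here is a toy sketch under the assumption that word embeddings are already available; the 3-dimensional vectors are fabricated purely for illustration, whereas a real system would load pretrained embeddings (and might use Word Mover's Distance instead of plain cosine similarity).

```python
# Toy sketch of the word-vector similarity check described above.
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical embeddings for the "teacher"/"face"/"mountain" example.
vec = {
    "teacher":  np.array([0.8, 0.6, 0.1]),
    "face":     np.array([0.7, 0.7, 0.2]),
    "mountain": np.array([0.1, 0.2, 0.9]),
}

for label in ("face", "mountain"):
    print(label, cosine_similarity(vec["teacher"], vec[label]))
# "face" scores higher, so the second data is associated with material content data A.
```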
  • Material content data can be, for example, image data, video data, sound data (eg, music data, voice data, sound effects, etc.), but is not limited to this.
  • the material content data may be stored in advance in the material content data storage unit 163 by the user or administrator, or may be acquired from the network and stored in the material content data storage unit 163.
  • the material content data allocation unit 146 allocates suitable material content data to cuts to which the corresponding second data is allocated, based on the above-described association.
  • the interface information storage unit 167 stores various control information to be displayed on the display unit (display, etc.) of the administrator terminal 2 or the user terminal 3.
  • the classifier 170 acquires learning data from a learning data storage unit (not shown) and performs machine learning to create a trained model; the classifier 170 may be created or updated periodically.
  • the learning data for creating the classifier may be data collected from the network or data owned by the user with class labels attached, or a data set with class labels may be procured and used.
  • the classifier 170 is, for example, a trained model using a convolutional neural network that, upon input of material content data, outputs one or more pieces of extracted information (e.g., class labels).
  • the classifier 170 extracts, for example, class labels representing objects associated with the material content data (e.g., seafood, grilled meat, people, furniture); a hedged sketch using an off-the-shelf pretrained classifier follows.
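As a hedged illustration of such a classifier, the sketch below uses torchvision's pretrained ResNet-18 as a stand-in; the patent's classifier 170 is trained on the system's own labeled data, and the file path here is a placeholder.

```python
# Stand-in CNN classifier producing class labels as "extracted information".
import torch
from torchvision import models
from PIL import Image

weights = models.ResNet18_Weights.DEFAULT
model = models.resnet18(weights=weights).eval()
preprocess = weights.transforms()

image = Image.open("material.jpg").convert("RGB")  # placeholder path
with torch.no_grad():
    logits = model(preprocess(image).unsqueeze(0))
top = logits.softmax(dim=1).topk(3)
labels = [weights.meta["categories"][i] for i in top.indices[0]]
print(labels)  # e.g. top-3 class labels used as extracted information
```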
  • FIG. 6 is a diagram explaining an example of the flow of creating composite content data.
  • first, the server 1 receives first data including at least information to be identified from the user terminal 3 via the communication unit 110 (step S101).
  • the information to be identified is, for example, one or more words, and the first data may be, for example, text data consisting of an article containing one or more words, or a summary of such text data.
  • next, the server 1 acquires the information to be identified by analyzing the first data with the identified information analysis unit 120, and the second data generation unit 130 generates one or more pieces of second data, each containing at least part of the information to be identified, together with division number information (step S102).
  • next, in the server 1, the composite content data generation unit 140 causes the base data generation unit 142 to generate base data including a number of cuts corresponding to the division number information (step S103).
  • next, the server 1 allocates the second data to the cuts by means of the second data allocation unit 144 (step S104).
  • the base data in this state may be displayed on the user terminal 3 so that the progress can be checked.
  • next, the server 1 causes the association unit 150 to associate the material content data in the material content data storage unit 163 with the second data (step S105), and the material content data allocation unit 146 allocates the material content data to the cuts (step S106).
  • next, the server 1 uses the second data placement position determination unit 180 to determine the placement position of the second data field 31 to be placed on each cut, based on the saliency region related to an object or range having saliency in the image, detected from the material content data (step S107).
  • the server 1 generates the base data to which the second data and the material content data are assigned as composite content data, stores the composite content data in the composite content data storage unit 165, and displays the composite content data on the user terminal 3 (step S108).
  • the arrangement position of the second data field 31 in each cut is determined by the second data arrangement position determination unit 180 as described above, and the server 1 inserts the second data field 31 into each cut according to the arrangement position determined by the second data arrangement position determination unit 180.
  • a list of the plurality of cuts forming the composite content data can be displayed on the screen as shown in FIG. 7; for each cut, along with the displayed material content data and second data, information on the playback time (in seconds) of the cut may also be displayed.
  • the user can, for example, correct the content by clicking the second data field 31 or a corresponding button, and replace the material content data by clicking the material content data field 32 or a corresponding button. Furthermore, the user can also add other material content data to each scene from the user terminal.
  • the order of the steps described above is an example; the generation or reading of the base data, the assignment of the second data, the association, the assignment of the material content data, and the determination of the arrangement position of the second data field 31 may each be executed at any timing, as long as whatever a step depends on has been prepared before that step (for example, the base data must have been read before the second data or material content data is assigned to it) and the steps do not contradict each other.
  • the setting of material content by the material content data setting unit 190 using the identified information analysis unit 120, the association unit 150, the classifier 170, and the second data placement position determination unit 180 described above is one example of a setting function of the composite content data creation system, and the setting method used by the material content data setting unit 190 is not limited to this.
  • the base data is generated by the base data generation unit 142 in the above example, but it may be read from the base data storage unit 161 instead.
  • the read-out base data may include, for example, a predetermined number of blank cuts, or may be template data in which predetermined material content data, format information, and the like (for example, music data, background images, font information, etc.) have been set for each cut.
  • the user may be able to set any material content to all or part of each data field from the user terminal.
  • a setting method combined with a user operation may also be used; for example, the user may input arbitrary text using the user terminal, the information to be identified may be extracted from the text as described above, and material content may be associated accordingly.
  • next, the second data field placement position determination method performed in step S107 above will be described.
  • FIG. 8 is a diagram explaining an example of the flow of the second data field arrangement position determination method.
  • as shown in FIG. 8, the method includes a step in which the saliency region determination unit 182 determines a saliency region in the material content data (image data) set for each cut (S201), and a step in which the placement position specifying unit 184 specifies the placement position of the second data field 31 on each cut by referring to the saliency region of the image (S202).
  • the saliency region determination unit 182 uses a saliency determination model, which is a trained model regarding saliency obtained by a known learning method such as the saliency object detection shown in FIG. 9 or the saliency map detection shown in FIG. 12, to discriminate objects and ranges having saliency in the image (hereinafter also referred to as "saliency regions").
  • the saliency determination model is stored in the model storage unit 169 of the storage unit 160, for example.
  • the saliency region determination unit 182 determines saliency regions in images based on saliency information as exemplified in the images shown in FIGS. 9 and 11 to 13 .
  • FIG. 9 shows an example of detecting a saliency object in an image using a saliency object detection model.
  • Saliency object detection using a saliency object detection model can be realized using a known technique such as an encoder-decoder model.
  • for example, the large and small mountains surrounded by dotted lines in FIG. 9 are detected as saliency objects in the image, and the saliency region determination unit 182 discriminates these large and small mountains as saliency objects.
  • FIGS. 10 and 11 show another example of detecting objects with saliency in an image using the saliency object detection model.
  • when saliency object detection is performed on the image including an animal shown in FIG. 10 using the saliency object detection model, the contour shape of the animal, shown as a relatively bright region in FIG. 11, is detected, and the saliency region determination unit 182 discriminates the region showing this contour shape as a saliency object.
  • FIG. 12 shows an example of detecting a saliency range in an image using a saliency map detection model.
  • saliency range detection using a saliency map detection model can be realized using a known technique, such as applying a trained model to a feature map generated by a convolutional neural network from an input image.
  • a saliency map is generated by determining the visual saliency strength of each pixel in the image.
  • in FIG. 12, the darkness of the black portions expresses the strength of visual saliency.
  • the saliency region determination unit 182 discriminates, as a saliency range, a range occupied by a large proportion of pixels with relatively strong visual saliency (in the example shown in FIG. 12, the region of the large mountains among the large and small mountains); a rough stand-in for this map-based detection is sketched below.
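For readers who want to experiment with map-based detection, here is a rough stand-in using the classic spectral-residual saliency method from opencv-contrib (pip install opencv-contrib-python); the patent itself relies on a trained model, so this only illustrates the map-threshold-region flow, and the input path is a placeholder.

```python
# Rough stand-in for saliency map detection: spectral-residual saliency,
# thresholded into a binary saliency range with a bounding box.
import cv2
import numpy as np

image = cv2.imread("input.jpg")  # placeholder path
saliency = cv2.saliency.StaticSaliencySpectralResidual_create()
ok, saliency_map = saliency.computeSaliency(image)  # float32 map in [0, 1]
assert ok

# Pixels with relatively strong visual saliency form the saliency range.
threshold = saliency_map.mean() + 2 * saliency_map.std()
saliency_mask = (saliency_map > threshold).astype(np.uint8)

ys, xs = np.nonzero(saliency_mask)
if len(xs):
    print("saliency region bounding box:",
          (int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())))
```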
  • FIG. 13 shows another example of detecting a saliency range in an image using a saliency map detection model.
  • when saliency range detection is performed using the saliency map detection model on the image including an animal shown in FIG. 10, the range showing the outer shape of the animal, shown as a relatively bright region in FIG. 13, is discriminated as the saliency range.
  • in FIG. 13, strong visual saliency is detected in the portion corresponding to the animal's face, indicating that the saliency of that portion is particularly high.
  • the placement position specifying unit 184 uses a placement position determination model, which is a trained model obtained by machine-learning the relationship between saliency regions in images and placement positions of the second data field 31, to specify the placement position of the second data field 31 on the cut into which the material content data (image) is inserted.
  • the arrangement position determination model is also stored, for example, in the model storage unit 169 of the storage unit 160 in the same manner as the saliency determination model.
  • the placement position determination model can be generated, for example, by machine learning with an arbitrary learner, using as training data images to which text has been added and in which the relationship between the saliency region in the image and the placement position of the text is recognized to be good. The learner that generates the placement position determination model extracts the saliency regions and the text from the training images and learns their relative positional relationships within the images. Images used as training data are preferably selected based on, for example, the following conditions for the relationship between the saliency region in the image and the placement position of the text:
  • the text is placed in a direction and at a distance that allow natural movement of the line of sight from the saliency region of the object to the text;
  • no part of the text overlaps the saliency region;
  • the text is placed near (or away from) a portion of the saliency region that has particularly high saliency.
  • the placement position specifying unit 184 calculates a recommended area for arranging the second data field 31 for the image data set for each cut, using the placement position determination model described above. This calculation may be performed, for example, by scoring the degree of recommendation of the placement position of the second data field 31 for each pixel of the image; a toy per-pixel scoring sketch follows.
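The actual scores come from the learned placement position determination model; as an assumed stand-in, the sketch below scores each pixel with a hand-written heuristic (zero on the saliency region, peaking at a fixed distance outside it) just to show the per-pixel scoring and argmax selection.

```python
# Hand-written stand-in for the learned per-pixel scoring.
import numpy as np
from scipy.ndimage import distance_transform_edt

def score_pixels(saliency_mask: np.ndarray, preferred_distance: float = 40.0) -> np.ndarray:
    # Distance (in pixels) from each pixel to the nearest salient pixel.
    dist = distance_transform_edt(saliency_mask == 0)
    scores = np.exp(-((dist - preferred_distance) ** 2) / (2 * 15.0 ** 2))
    scores[saliency_mask > 0] = 0.0  # never place text on the saliency region
    return scores

mask = np.zeros((180, 320), dtype=np.uint8)
mask[60:120, 80:160] = 1  # toy saliency region
scores = score_pixels(mask)
y, x = np.unravel_index(np.argmax(scores), scores.shape)
print("recommended anchor pixel:", (int(x), int(y)))
```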
  • FIGS. 14 and 16 are diagrams showing the second data field placement recommendation areas calculated by the placement position specifying unit 184. FIG. 14 shows an example in which the recommendation area appears to the upper right of the large mountains that are the saliency objects in the image shown in FIG. 9; in this example, the portion of the recommendation area shown in high density has a high score. FIG. 16 shows an example in which the recommendation area appears to the right of the animal that is the saliency object in the image shown in FIG. 11; in this example, the portion of the recommendation area shown in low density has a high score. FIGS. 14 and 16 show the recommendation areas specified by the placement position specifying unit 184 for explanation and visualization; in the actual processing, display of the recommendation areas is not essential.
  • the placement position specifying unit 184 specifies, as the placement position of the second data field 31, a portion with a higher score among the second data field placement regions obtained as described above.
  • FIG. 15 shows a state in which the second data field 31 is arranged in the high-scoring portion, indicated by high density, of the recommendation area in the example shown in FIG. 14, and FIG. 17 shows a state in which the second data field 31 is arranged in the high-scoring portion, indicated by low density, of the recommendation area in the example shown in FIG. 16.
  • in this way, the saliency region in the image is determined based on the saliency information, and the arrangement position of the second data field 31 is specified in consideration of its positional relationship with the saliency region, so the server 1 can easily create composite content data in which the second data field 31 is arranged at an appropriate position with respect to the saliency region in the image on each cut.
  • the placement position specifying unit 184 of the second data placement position determining unit 180 specifies the placement position of the second data field 31 using the placement position determination model.
  • however, depending on the nature of the placement position determination model, a position where part of the second data field 31 overlaps the saliency region in the image may be specified as the arrangement position of the second data field 31.
  • therefore, it is preferable that the second data placement position determination unit 180 determines whether the specified placement position of the second data field 31 is a position where part of the second data field 31 overlaps the saliency region in the image, and, when it determines that they overlap, specifies again, as the placement position of the second data field 31, a position shifted so that the second data field 31 does not overlap the saliency region.
  • in this case, the second data placement position determination unit 180 may, for example, specify the position at which the overlap can be eliminated with the shortest displacement distance from the first specified placement position of the second data field 31, and correct the placement position of the second data field 31 to that position; a minimal geometric sketch of such a shift appears below.
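A minimal geometric sketch of such a shift, assuming both the text field and the saliency region are modeled as axis-aligned rectangles (the patent does not prescribe this geometry):

```python
# Shift a text-field rectangle out of a saliency bounding box by the smallest
# displacement along one axis.
from dataclasses import dataclass

@dataclass
class Rect:
    x0: float
    y0: float
    x1: float
    y1: float

def shift_out_of(field: Rect, salient: Rect) -> Rect:
    overlap_x = min(field.x1, salient.x1) - max(field.x0, salient.x0)
    overlap_y = min(field.y1, salient.y1) - max(field.y0, salient.y0)
    if overlap_x <= 0 or overlap_y <= 0:
        return field  # no overlap, nothing to do
    if overlap_x <= overlap_y:  # cheaper to move horizontally
        dx = overlap_x if field.x0 + field.x1 > salient.x0 + salient.x1 else -overlap_x
        return Rect(field.x0 + dx, field.y0, field.x1 + dx, field.y1)
    dy = overlap_y if field.y0 + field.y1 > salient.y0 + salient.y1 else -overlap_y
    return Rect(field.x0, field.y0 + dy, field.x1, field.y1 + dy)

print(shift_out_of(Rect(100, 50, 220, 90), Rect(80, 60, 200, 160)))
# -> Rect(x0=100, y0=20, x1=220, y1=60): moved up by the 30-pixel overlap.
```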
  • the second data placement position determining unit 180 may specify a position away from the more salient portion of the saliency object as the placement position of the second data field 31 .
  • in particular, when saliency map detection is used as described with reference to FIG. 13, the overall shape of the detected saliency object may be unclear, so when the second data field 31 is actually arranged at the specified position, part of the second data field 31 may overlap the saliency region.
  • the placement position specifying unit 184 of the second data placement position determining unit 180 can place the second data field 31 at a more appropriate position with respect to the detected saliency region.
  • in the example described above, the degree of recommendation of the arrangement position of the second data field 31 is scored for each pixel of the image. Since a scoring value is obtained for each pixel, detailed information about the degree of recommendation of the arrangement position of the second data field 31 can be obtained. On the other hand, since the scoring value assigned to each individual pixel among a large number of pixels is extremely small, the placement recommendation region specified by their distribution tends to be uniform and relatively large.
  • in this modification, when the placement position specifying unit 184 calculates the recommended area for arranging the second data field 31 for the image data set for each cut using the placement position determination model described above, it divides the target image into a specific number of cells in advance, scores the degree of recommendation of the placement position of the second data field 31 for each cell, and sets the position of the cell with the highest scoring value as the placement position of the second data field 31.
  • the functions and operations of the other components of the server 1 that implements this modification are as described above.
  • FIG. 19 is a diagram showing a modification in which the placement position specifying unit 184 specifies the placement position of the second data field 31 on the cut into which the material content data (image) is inserted.
  • in this modification, the placement position specifying unit 184 divides the entire image on the cut into which the material content data (image) is inserted into a plurality of cells, calculates for each cell, using the placement position determination model, a scoring value relating to the degree of recommendation of the placement position of the second data field 31, and specifies the cell with the highest scoring value as the cell in which the second data field 31 should be placed.
  • although FIG. 19 shows an example in which the entire image is partitioned into 18 cells (3 vertically by 6 horizontally), the number of cells partitioning the entire image may be set arbitrarily. Also, in FIG. 19, the boundary lines separating the cells are indicated by dashed lines, but these are shown for explanation; such boundary lines are not actually drawn in the processing by the placement position specifying unit 184.
  • first, the placement position specifying unit 184 divides the entire image of the cut into which the material content data (image) is inserted into a predetermined number of cells (see FIG. 19(a)).
  • next, the placement position specifying unit 184 calculates a recommended area for arranging the second data field 31 for the image data set for each cut, using the placement position determination model (see FIG. 19(b)).
  • the above calculation is performed by scoring the degree of recommendation of the placement position of the second data field 31 for each cell of the image divided as shown in FIG. 19(a).
  • FIG. 19(b) shows the scoring values calculated for each cell. Although the scoring values of each cell are shown in FIG. 19(b) for explanation and visualization, the display of these scoring values is not essential in the actual processing by the placement position specifying unit 184.
  • the placement position specifying unit 184 specifies the cell with the highest calculated scoring value among these cells as the placement position of the second data field 31 (see FIG. 19(c)).
  • in the example shown in FIG. 19(c), the cell in the second column from the right in the top row has the highest scoring value of 0.34, so that cell is identified as the placement position of the second data field 31.
  • in FIG. 19(c), the cell specified as the placement position of the second data field 31 is shaded for explanation and visualization; in the actual processing by the placement position specifying unit 184, no shading is generated or displayed.
  • in this way, in this modification, the entire image of the cut is divided into a predetermined number of cells, the placement position specifying unit 184 calculates a scoring value regarding the degree of recommendation of the placement position of the second data field 31 for each cell, and the cell with the highest scoring value among those cells is specified as the arrangement position of the second data field 31.
  • compared with scoring each pixel, the number of targets for which a scoring value is calculated is small, so the differences in scoring values between cells become clear; as a result, it becomes easier to uniquely identify the cell with the highest scoring value as the optimal placement position of the second data field 31. A grid-scoring sketch follows.
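A small sketch of this cell-based scoring, assuming a per-pixel score map (such as the heuristic one in the earlier sketch) is available to be averaged over each cell of a 3x6 grid:

```python
# Average a per-pixel score map over a grid and pick the best cell.
import numpy as np

def best_cell(scores: np.ndarray, rows: int = 3, cols: int = 6) -> tuple[int, int]:
    h, w = scores.shape
    cell_scores = np.zeros((rows, cols))
    for r in range(rows):
        for c in range(cols):
            cell = scores[r * h // rows:(r + 1) * h // rows,
                          c * w // cols:(c + 1) * w // cols]
            cell_scores[r, c] = cell.mean()
    r, c = np.unravel_index(np.argmax(cell_scores), cell_scores.shape)
    return int(r), int(c)  # row/column of the cell that should hold the text field

# e.g. best_cell(score_pixels(mask)) with the toy mask from the earlier sketch.
```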
  • FIG. 20 is a diagram showing various placement examples for placing the second data field 31 at the position of the cell specified by the method shown in FIG.
  • the second data field 31 is placed within the specified cell based on settings made in advance in the placement position specifying unit 184.
  • FIG. 20(a) shows an example in which the placement position specifying unit 184 is set so as to place the second data field 31 in the center of the specified cell.
  • in this case, the placement position specifying unit 184 places the second data field 31 in the center of the specified cell so that the center of the cell and the center of the second data field 31 approximately coincide; in this example, the left and right portions of the second data field 31 extend into the cells adjacent on the left and right.
  • FIG. 20(b) shows an example in which the placement position specifying unit 184 is set so that, within the identified cell, the second data field 31 is placed as far away as possible from the saliency region in the image (the large mountains in the example shown in FIG. 20). In this case, the placement position specifying unit 184 places the second data field 31 in the identified cell so that it extends along the upper edge of the cell, with the right portion of the second data field 31 extending into the cell adjacent on the right.
  • FIG. 20(c) shows an example in which the placement position specifying unit 184 is set so as to place the second data field 31 at a position as close as possible to the saliency region (the large mountains) within the identified cell. In this case, the placement position specifying unit 184 places the second data field 31 in the identified cell so that it extends along the bottom edge of the cell, with part of the second data field 31 extending into the cell adjacent on the left.
  • the various settings of the placement position specifying unit 184 described above can be changed as appropriate in the server 1 according to user input received from the user terminal 3 via the communication unit 110, for example; a small sketch of such in-cell alignment appears below.
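The in-cell alignment options of FIG. 20 can be sketched roughly as follows, under an assumed top-left coordinate model with toy dimensions; the mode names and geometry are illustrative, not the patent's specification.

```python
# Center the text field in the chosen cell, or push it toward/away from the
# saliency region's centroid.
def place_in_cell(cell, field_w, field_h, mode="center", salient_center=None):
    """cell = (x0, y0, x1, y1); returns the field's top-left corner (x, y)."""
    x0, y0, x1, y1 = cell
    cx, cy = (x0 + x1) / 2, (y0 + y1) / 2
    if mode == "center" or salient_center is None:
        return cx - field_w / 2, cy - field_h / 2
    sx, sy = salient_center
    toward = mode == "near"  # "near" hugs the saliency side, "far" the opposite
    x = x0 if (sx < cx) == toward else x1 - field_w
    y = y0 if (sy < cy) == toward else y1 - field_h
    return x, y

cell = (160, 0, 267, 60)
print(place_in_cell(cell, 120, 24, mode="far", salient_center=(80, 120)))
# -> (147, 0): along the cell's upper edge, pushed away from the saliency region.
```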
  • depending on the position of the specified cell and on these settings, the second data field 31 placed in the specified cell may protrude outside the area of the entire image of the cut or may overlap the saliency region.
  • in cases where the second data field 31 placed as described with reference to FIGS. 20(a)-(c) does not meet these constraints, it is preferable that the second data placement position determination unit 180 determines whether the placed second data field 31 protrudes outside the area of the entire image of the cut or overlaps the saliency region, and, when it determines that at least one of these applies, specifies again, as the placement position, a position shifted so that the second data field 31 satisfies the above constraints.
  • in the above, an example has been described in which the placement position specifying unit 184 specifies the placement position of the second data field 31 on the cut into which the material content data (image) is inserted, using the placement position determination model, which is a trained model obtained by machine-learning the relationship between saliency regions in images and placement positions of the second data field 31.
  • the second data field 31 can be arranged at an appropriate position considering the positional relationship between the saliency region and the second data field 31 .
  • on the other hand, scoring every pixel of the image data requires a particularly large computational cost.
  • FIG. 21 is a diagram showing another example in which the placement position specifying unit 184 specifies the placement position of the second data field 31 on the cut into which the material content data (image) is inserted.
  • the placement position specifying unit 184 divides the entire cut image into which the material content data (image) is inserted into a plurality of cells, and specifies the optimum cell in which the second data field 31 should be placed.
  • although FIG. 21 shows an example in which the entire image is partitioned into 9 cells (3 vertically by 3 horizontally), the number of cells partitioning the entire image may be set arbitrarily.
  • in FIG. 21, the boundary lines separating the cells are indicated by dashed lines, but these are shown for explanation; such boundary lines are not actually drawn in the processing by the placement position specifying unit 184. Note that the functions and operations of the other components of the server 1 that implements this example are as described above.
  • the placement position specifying unit 184 divides the entire cut image into which the material content data (image) is inserted into a predetermined number of cells (see FIG. 21(a)).
  • next, the placement position specifying unit 184 excludes, from the placement position candidates for the second data field 31, cells that include at least part of the saliency region identified by the saliency region determination unit 182 (see FIG. 21(b)).
  • in FIG. 21(b), the cells excluded from the placement position candidates are indicated by X marks for explanation and visualization; the X marks are not generated or displayed in the actual processing by the placement position specifying unit 184.
  • the placement position specifying unit 184 then specifies, as the placement position of the second data field 31, the cell closest to the saliency region in the image among the remaining cells that have not been excluded (see FIG. 21(c)); a sketch of this exclusion-based selection follows.
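A sketch of this exclusion-based selection, assuming a binary saliency mask and the "closest to the saliency region" condition; the grid size and toy mask are illustrative assumptions.

```python
# Drop every cell touching the saliency mask, then keep the remaining cell
# whose center is closest to the saliency region's centroid.
import numpy as np

def pick_cell(saliency_mask: np.ndarray, rows: int = 3, cols: int = 3):
    h, w = saliency_mask.shape
    ys, xs = np.nonzero(saliency_mask)
    cy, cx = ys.mean(), xs.mean()  # centroid of the saliency region
    best, best_dist = None, float("inf")
    for r in range(rows):
        for c in range(cols):
            y0, y1 = r * h // rows, (r + 1) * h // rows
            x0, x1 = c * w // cols, (c + 1) * w // cols
            if saliency_mask[y0:y1, x0:x1].any():
                continue  # exclude cells containing any part of the saliency region
            dist = np.hypot((y0 + y1) / 2 - cy, (x0 + x1) / 2 - cx)
            if dist < best_dist:
                best, best_dist = (r, c), dist
    return best  # closest remaining cell, per the "closest to saliency" condition

mask = np.zeros((180, 320), dtype=np.uint8)
mask[60:120, 80:160] = 1
print(pick_cell(mask))  # -> (0, 1) for this toy mask
```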
  • in FIG. 21(c), the cell specified as the placement position of the second data field 31 is shaded for explanation and visualization; in the actual processing by the placement position specifying unit 184, no shading is generated or displayed.
  • in this way, in this example, from among the multiple placement position candidates for the second data field 31 generated by dividing the entire image of the cut into a predetermined number of cells, the cell determined to be most suitable under a predetermined condition is specified as the placement position of the second data field 31.
  • although the appropriateness of the placement position with respect to the saliency region may be somewhat reduced compared with the method using the trained placement position determination model, this method has the advantage of a lower computational cost.
  • in the example described above, the condition for selecting the placement position of the second data field 31 is being closest to the saliency region, but the selection condition for the placement position is not limited to this.
  • for example, the condition may be a position at a predetermined distance from the saliency region, a position farthest from the saliency region, or a position in a predetermined direction from the saliency region, and such conditions may be combined arbitrarily as long as they do not contradict one another.

Abstract

[Problem] To provide a server, etc., that makes it possible to easily create composite content data, and in particular to arrange a text sentence in consideration of its arrangement position relative to a saliency region in a moving image. [Solution] According to one embodiment of the present invention, there is provided a server 1 comprising: a material content data setting unit 190 that sets image data for a cut; and a text field arrangement position determination unit 180 that determines the position of a text field (second data field) to be arranged on the cut, the determination being made with reference to a saliency region in the image data.

Description

Server and text field placement position method, program
The present invention relates to a server for generating video content to be distributed to user terminals, a text field placement position method, and a program.
Conventionally, content data such as moving images has been created; for example, Patent Document 1 proposes a moving image processing apparatus that efficiently searches for a desired scene image from a moving image having a plurality of chapters.
Japanese Patent Application Laid-Open No. 2011-130007
Creating content data such as moving images takes a great deal of time and effort. In particular, when creating composite content data that uses multiple pieces of material content data such as text data, moving images (including images; the same applies hereinafter), and sound data, it can be difficult for users, depending on their technical level as creators of such content, to work out the optimal combination of these materials. There has therefore been demand for a system that allows composite content data to be created easily.
In particular, when combining text data with video data, it is necessary to select an appropriate position on the image content and place the text sentences (including subtitles, captions, etc.) generated from the text data there. In general, a moving image includes objects or areas with saliency that draw the viewer's attention (hereinafter referred to as "saliency regions"), and if a text sentence is superimposed on at least part of a saliency region, the visibility of the moving image is impaired. Furthermore, if the text sentence is not placed in a direction and at a distance that allow a natural movement of the line of sight from the saliency region of the object to the text, the visibility of the moving image is impaired and the readability of the text is also impaired. Therefore, when a user manually places text sentences in a moving image, for each moving image in the entire composite content being created, the user must take care that the placed text does not overlap the saliency regions in the moving image, and must pay attention to the positional relationship between the saliency regions and the text.
Therefore, an object of the present invention is to provide a server or the like that makes it possible to easily create composite content data and, in particular, to place text sentences in consideration of their positional relationship with saliency regions in a moving image.
According to one aspect of the present invention, there is provided a server comprising: a material content data setting unit that sets image data for a cut; and a text field placement position determination unit that determines the position of a text field to be placed on the cut by referring to a saliency region in the image data.
Other features and advantages of the present invention can be understood from the following description and the accompanying drawings, which are given by way of example and are not exhaustive.
According to the present invention, it is possible to provide a server or the like that makes it possible to easily create composite content data and, in particular, to place text sentences in consideration of their positional relationship with saliency regions in a moving image.
FIG. 1 is a configuration diagram of a system according to an embodiment.
FIG. 2 is a configuration diagram of a server according to an embodiment.
FIG. 3 is a configuration diagram of a management terminal and a user terminal according to an embodiment.
FIG. 4 is a functional block diagram of a system according to an embodiment.
FIG. 5 is a diagram explaining an example screen layout that constitutes a cut.
FIG. 6 is a flow chart of a system according to an embodiment.
FIG. 7 is an explanatory diagram of an aspect in which a plurality of cuts forming composite content data are displayed as a list on a screen.
FIG. 8 is a diagram explaining a second data field placement position determination method according to an embodiment.
FIG. 9 is a diagram explaining saliency object detection according to an embodiment.
FIG. 10 is a diagram showing an example original image for saliency-based detection.
FIG. 11 is a diagram showing an example of saliency object detection for the image of FIG. 10.
FIG. 12 is a diagram explaining saliency map detection according to an embodiment.
FIG. 13 is a diagram showing an example of saliency map detection for the image of FIG. 12.
FIG. 14 is a diagram showing an example in which a second data field placement recommendation area appears to the upper right of the large mountains that are the saliency objects in the image shown in FIG. 9.
FIG. 15 is a diagram showing a state in which the second data field is placed in the high-scoring portion, indicated by high density, of the second data field placement recommendation area in the example shown in FIG. 14.
FIG. 16 is a diagram showing an example in which a second data field placement recommendation area appears to the right of the animal that is the saliency object in the image shown in FIG. 11.
FIG. 17 is a diagram showing a state in which the second data field is placed in the high-scoring portion, indicated by low density, of the second data field placement recommendation area in the example shown in FIG. 16.
FIG. 18 is a diagram showing an example of hybrid saliency map detection for the image of FIG. 10.
FIG. 19 is a diagram showing a modification in which the placement position specifying unit specifies the placement position of the second data field on the cut into which the material content data (image) is inserted.
FIG. 20 is a diagram showing various placement examples for placing the second data field at the cell position identified by the method shown in FIG. 19.
FIG. 21 is a diagram showing another example in which the placement position specifying unit specifies the placement position of the second data field on the cut into which the material content data (image) is inserted.
The contents of embodiments of the present invention are listed and described below. A server and the like according to embodiments of the present invention have the following configurations.
[Item 1]
A server comprising:
a material content data setting unit that sets image data for a cut; and
a text field placement position determination unit that determines the position of a text field to be placed on the cut with reference to a saliency region in the image data.
[Item 2]
The server according to Item 1, wherein the text field placement position determination unit includes:
a saliency region determination unit that determines a saliency region included in the image data; and
a placement position specifying unit that specifies the placement position of the text field on the cut with reference to the saliency region.
[Item 3]
The server according to Item 2, wherein the placement position specifying unit is configured to specify the placement position of the text field on the cut based on a model generated by learning, as training data, image data in which the relationship between the saliency region in the image and the text field satisfies a predetermined condition.
[Item 4]
The server according to Item 3, wherein the placement position specifying unit is configured to specify the placement position of the text field on the cut based on a scoring value calculated for each pixel of the image data.
[Item 5]
The server according to Item 3, wherein the placement position specifying unit is configured to specify the placement position of the text field on the cut based on a scoring value calculated for each of a plurality of cells into which the image data is divided.
[Item 6]
The server according to Item 2, wherein the placement position specifying unit is configured to execute:
dividing the entire image of the cut in which the image data is set into a plurality of cells;
excluding, from the plurality of cells, cells that contain at least part of the saliency region; and
specifying, from the plurality of cells remaining after the exclusion, a cell that satisfies a predetermined condition regarding its relationship with the saliency region as the placement position of the text field.
[Item 7]
The server according to any one of Items 1 to 6, wherein the saliency region is detected by hybrid saliency map detection using saliency object detection and saliency map detection.
[Item 8]
The server according to any one of Items 1 to 6, wherein the saliency region is detected by saliency map detection.
[Item 9]
The server according to any one of Items 1 to 6, wherein the saliency region is detected by saliency object detection.
[Item 10]
A system comprising the server according to any one of Items 1 to 9.
[Item 11]
A text field placement position method executed by a computer, the method comprising:
setting image data for a cut; and
determining the position of a text field to be placed on the cut based on a saliency region in the image data.
[Item 12]
A program for causing a computer to execute a text field placement position method, the method comprising:
setting image data for a cut; and
determining the position of a text field to be placed on the cut based on a saliency region in the image data.
<Details of Embodiment>
A system for creating composite content data according to an embodiment of the present invention (hereinafter referred to as "this system") and the like will now be described. In the accompanying drawings, identical or similar elements are given identical or similar reference numerals and names, and duplicate descriptions of identical or similar elements may be omitted in the description of each embodiment. Features shown in different embodiments are applicable to other embodiments as long as they do not contradict one another.
<Configuration>
As shown in Fig. 1, this system according to the embodiment comprises a server 1, an administrator terminal 2, and a user terminal 3. The server 1, the administrator terminal 2, and the user terminal 3 are connected via a network N so as to be able to communicate with one another. The network N may be a local network or may be connectable to an external network. Although Fig. 1 describes an example in which the server 1 consists of a single unit, the server 1 may also be realized by a plurality of server devices. The server 1 and the administrator terminal 2 may also be integrated.
<Server 1>
Fig. 2 is a diagram showing the hardware configuration of the server 1 shown in Fig. 1. The illustrated configuration is an example, and other configurations may be employed. The server 1 may be a general-purpose computer such as a workstation or a personal computer, or may be logically realized by cloud computing.
The server 1 includes at least a processor 10, a memory 11, a storage 12, a transmission/reception unit 13, and an input/output unit 14, which are electrically connected to one another through a bus 15.
The processor 10 is an arithmetic device that controls the overall operation of the server 1, controls the transmission and reception of data between the elements, and performs the information processing necessary for executing applications and for authentication processing. For example, the processor 10 includes a CPU (Central Processing Unit) and a GPU (Graphics Processing Unit), and performs each information process by executing programs for this system that are stored in the storage 12 and loaded into the memory 11. The processing capability of the processor 10 only needs to be sufficient to execute the necessary information processing; for example, the processor 10 may consist of a CPU alone, but it is not limited to this.
The memory 11 includes a main memory composed of a volatile storage device such as a DRAM (Dynamic Random Access Memory) and an auxiliary memory composed of a non-volatile storage device such as a flash memory or an HDD (Hard Disk Drive). The memory 11 is used as a work area and the like for the processor 10, and may also store a BIOS (Basic Input/Output System) executed when the server 1 starts up, various setting information, and the like.
The storage 12 stores various programs such as application programs. A database storing the data used for each process may be constructed in the storage 12. In particular, in this embodiment, the storage 12 stores a computer program for causing the server 1 to execute the composite content data creation method described with reference to Fig. 6, and further stores a computer program for causing the server 1 to execute the second data placement position determination method described with reference to Fig. 8 and other figures.
The transmission/reception unit 13 connects the server 1 to the network.
The input/output unit 14 comprises information input devices such as a keyboard and mouse and output devices such as a display.
The bus 15 is commonly connected to the above elements and transmits, for example, address signals, data signals, and various control signals.
<Administrator Terminal 2, User Terminal 3>
The administrator terminal 2 and the user terminal 3 shown in Fig. 3 each also include a processor 20, a memory 21, a storage 22, a transmission/reception unit 23, an input/output unit 24, and the like, which are electrically connected to one another through a bus 25. Since each element can be configured in the same manner as in the server 1 described above, detailed descriptions of the elements are omitted. The administrator uses the administrator terminal 2 to, for example, change the settings of the server 1 and manage the operation of the database. A user can access the server 1 from the user terminal 3 to, for example, create or view composite content data.
<Functions of Server 1>
Fig. 4 is a block diagram illustrating the functions implemented in the server 1. In this embodiment, the server 1 includes a communication unit 110, a storage unit 160, and a material content data setting unit 190. The material content data setting unit 190 includes an identified information analysis unit 120, a second data generation unit 130, a composite content data generation unit 140, an association unit 150, a classifier 170, and a second data placement position determination unit 180. The composite content data generation unit 140 includes a base data generation unit 142, a second data allocation unit 144, and a material content data allocation unit 146. The storage unit 160 is composed of storage areas such as the memory 11 and the storage 12, and includes various databases such as a base data storage unit 161, a material content data storage unit 163, a composite content data storage unit 165, and an interface information storage unit 167. The second data placement position determination unit 180 includes a saliency region determination unit 182 that determines a saliency region relating to an object or range having saliency in material content data, and a placement position specifying unit 184 that specifies the position at which the second data field is to be placed. The functions of the units 120, 130, 140, 150, 170, and 180 constituting the material content data setting unit 190 can be realized by, for example, one or more processors 10.
The communication unit 110 communicates with the administrator terminal 2 and the user terminal 3. The communication unit 110 also functions as a reception unit that receives, from the user terminal 3, first data including information to be identified. The first data may be, for example, text data such as an article containing the information to be identified (for example, a press release or news), image data containing the information to be identified (for example, a photograph or an illustration), video data, or audio data containing the information to be identified. The text data referred to here is not limited to data that is already text at the time it is transmitted to the server 1; it may be, for example, text data generated by a known speech recognition technique from audio data transmitted to the server 1. The first data may also be text data such as an article that has been summarized (while retaining the information to be identified) by an existing automatic summarization technique such as extractive or abstractive summarization; in that case, the number of cuts included in the base data is reduced, the data volume of the entire composite content data can be made smaller, and the content can be more concise.
The audio data referred to here is not limited to audio data acquired by an input device such as a microphone; it may be audio data extracted from video data or audio data generated from text data. In the former case, only audio data such as narration and dialogue may be extracted from a provisional moving image, such as one made of rough sketches or other provisional images and video, and composite content data may be generated from that audio data together with material content data, as described later. In the latter case, audio data may be created from text data having a story; for a fairy tale, for example, a picture-story show or a moving image combining the read-aloud story with material content data may be generated as composite content data.
When the second data generation unit 130 determines that the first data does not need to be divided (for example, when the text data is a short sentence with no more than a preset number of characters), the second data generation unit 130 generates the first data as the second data as it is. On the other hand, when it determines that the first data needs to be divided (for example, when the text is longer than the preset number of characters), the second data generation unit 130 divides the first data and generates pieces of second data, each containing at least part of the information to be identified of the first data. At this time, it also generates division number information for the second data. Any known technique may be used for dividing the first data; for example, if the first data can be converted into text, the sentences may be divided so that sections that read naturally as sentences fit into each cut, based on the preset maximum number of characters per cut of the base data and an analysis of the modification relationships between clauses.
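The specification leaves the concrete splitting rule to the implementation. The following is a minimal Python sketch of one plausible realization, assuming a fixed per-cut character limit and splitting at sentence boundaries; the limit value and the sentence-boundary regex are illustrative assumptions, not taken from the specification.

```python
import re

MAX_CHARS = 40  # assumed per-cut character limit; the actual value is a configuration matter


def split_into_second_data(first_data: str, max_chars: int = MAX_CHARS) -> list[str]:
    """Split the first data into second-data chunks, one per cut.

    Sentences are kept intact so each cut reads naturally; in this simplified
    sketch a sentence longer than max_chars simply becomes its own chunk.
    """
    sentences = re.split(r"(?<=[。．.!?])\s*", first_data.strip())
    chunks: list[str] = []
    current = ""
    for sentence in filter(None, sentences):
        if len(current) + len(sentence) <= max_chars:
            current += sentence
        else:
            if current:
                chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks  # len(chunks) doubles as the division-number information


text = "This is a press release. It introduces a new product. The product ships in spring."
for i, chunk in enumerate(split_into_second_data(text), start=1):
    print(f"cut {i}: {chunk}")
```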
The identified information analysis unit 120 analyzes the above-described second data and acquires the information to be identified. Here, the information to be identified may be any information as long as it can be analyzed by the identified information analysis unit 120. In one aspect, the information to be identified may be in a word format defined by a language model. More specifically, it may be one or more words accompanied by the word vectors described later (for example, "Shibuya, Shinjuku, Roppongi" or "Shibuya, landmark, youth"). Depending on the language model, these words may include words that are not normally used on their own, such as the Japanese syllable "n". Instead of the word format, a document accompanied by a vector representing the entire sentence, or a feature vector extracted from an image or a moving image, may also be used.
The composite content data generation unit 140 generates, with the base data generation unit 142, base data including a number of cuts (one or more) corresponding to the division number information generated by the second data generation unit 130 described above; generates, as composite content data, the base data in which material content data newly input from the user terminal 3 and/or material content data stored in the material content data storage unit 163 and the above-described second data are allocated to each cut; stores it in the composite content data storage unit 165; and displays the composite content data on the user terminal 3. The base data generation unit 142 assigns numbers to the generated cuts, for example scene 1, scene 2, scene 3 or cut 1, cut 2, cut 3.
Fig. 5 shows an example of the screen layout of a cut constituting the base data. In the example shown in Fig. 5(a), the edited second data (for example, a delimited text sentence) is inserted into the second data field 31, which is a text data field, and the selected material content data is inserted into the material content data field 32. In the example shown in Fig. 5(a) the second data field 31 and the material content data field 32 are separated; alternatively, as shown in Fig. 5(b), the second data field 31 may be inserted so as to be superimposed on the material content data field 32. In that case, however, the second data field 31 must be placed so as not to overlap the saliency region in the material content data.
For each cut of the base data, the preset maximum number of characters mentioned above (in the case of text data), a screen layout, and a playback time (in the case of a moving image) may be defined in advance. The composite content data does not necessarily have to be saved in the composite content data storage unit 165 at this point and may be stored at an appropriate timing. Base data to which only the second data has been allocated may also be displayed on the user terminal 3 as progress information for the composite content data.
Referring again to Fig. 4, the second data allocation unit 144 sequentially allocates the second data to the one or more cuts generated by the base data generation unit 142 described above, in the order of the numbers assigned to them.
The association unit 150 compares at least part of the information to be identified contained in the above-described second data with, for example, extraction information extracted from the material content data (for example, a class label extracted by the classifier), determines, for example, their mutual similarity, and associates the second data with material content data suited to it (for example, data with a high degree of similarity). As a more specific example, suppose the information to be identified contained in the second data represents "teacher", and that material content data A (for example, an image of a woman) whose extraction information is "face" and material content data B (for example, an image of Mt. Fuji) whose extraction information is "mountain" are available. The word vector obtained from "teacher" is more similar to the word vector obtained from "face" than to the word vector obtained from "mountain", so the second data is associated with material content data A. The extraction information of the material content data may be extracted in advance by the user and stored in the material content data storage unit 163, or may be extracted by the classifier 170 described later. The similarity determination may be performed by preparing a trained model that has learned word vectors and using those vectors to determine word similarity by a method such as cosine similarity or Word Mover's Distance.
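As a rough illustration of the similarity-based association, the sketch below scores candidate material content by the cosine similarity between word vectors, one of the methods the specification names. The toy three-dimensional vectors and file names are invented stand-ins for a trained embedding model and a real material library.

```python
import numpy as np

# Toy word vectors standing in for a trained embedding model (assumption:
# the real system would load vectors learned by a language model).
VECTORS = {
    "teacher":  np.array([0.9, 0.1, 0.0]),
    "face":     np.array([0.8, 0.3, 0.1]),
    "mountain": np.array([0.0, 0.2, 0.9]),
}


def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def associate(identified_word: str, materials: dict[str, str]) -> str:
    """Return the material whose extracted class label is most similar
    to the identified word contained in the second data."""
    scores = {
        name: cosine_similarity(VECTORS[identified_word], VECTORS[label])
        for name, label in materials.items()
    }
    return max(scores, key=scores.get)


materials = {"material_A.jpg": "face", "material_B.jpg": "mountain"}
print(associate("teacher", materials))  # -> material_A.jpg
```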
The material content data may be, for example, image data, video data, or sound data (for example, music data, audio data, or sound effects), but it is not limited to these. The material content data may be stored in advance in the material content data storage unit 163 by the user or the administrator, or may be acquired from the network and stored in the material content data storage unit 163.
Based on the above-described association, the material content data allocation unit 146 allocates the suitable material content data to the cut to which the corresponding second data has been allocated.
The interface information storage unit 167 stores various control information for display on the display unit (a display or the like) of the administrator terminal 2 or the user terminal 3.
The classifier 170 is created as a trained model by acquiring learning data from a learning data storage unit (not shown) and performing machine learning. The classifier 170 is created periodically. The learning data for creating the classifier may be data collected from the network or data owned by the user, with class labels attached, or a data set with class labels may be procured and used. The classifier 170 is, for example, a trained model using a convolutional neural network; when material content data is input, it extracts one or more pieces of extraction information (for example, class labels). The classifier 170 extracts, for example, class labels representing objects related to the material content data (for example, seafood, grilled meat, person, furniture).
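The specification does not fix a network architecture, so the following sketch abstracts the CNN away and only illustrates the final step described here: turning classifier outputs into class labels used as extraction information. The label set and the logits are hypothetical.

```python
import numpy as np

CLASS_LABELS = ["seafood", "grilled_meat", "person", "furniture"]  # example label set


def extract_labels(logits: np.ndarray, top_k: int = 2) -> list[str]:
    """Turn classifier logits into the top-k class labels used as
    extraction information for a piece of material content."""
    probs = np.exp(logits - logits.max())  # numerically stable softmax
    probs /= probs.sum()
    top = np.argsort(probs)[::-1][:top_k]
    return [CLASS_LABELS[i] for i in top]


# Stand-in for a CNN forward pass on one image (assumption: the real
# classifier is a trained convolutional neural network).
logits = np.array([0.2, 0.1, 3.4, 1.1])
print(extract_labels(logits))  # -> ['person', 'furniture']
```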
Fig. 6 is a diagram explaining an example of the flow of creating composite content data.
First, the server 1 receives first data including at least information to be identified from the user terminal 3 via the communication unit 110 (step S101). In this example, the information to be identified is, for example, one or more words, and the first data may be, for example, text data consisting of an article containing those words, or a summary of such text data.
Next, the server 1 analyzes the first data with the identified information analysis unit 120 to acquire the information to be identified, and generates, with the second data generation unit 130, one or more pieces of second data each containing at least part of the information to be identified, together with division number information (step S102).
Next, in the composite content data generation unit 140 of the server 1, the base data generation unit 142 generates base data including a number of cuts corresponding to the above-described division number information (step S103).
Next, the server 1 allocates the second data to the cuts with the second data allocation unit (step S104). The base data in this state may be displayed on the user terminal 3 so that progress can be checked.
Next, based on at least part of the information to be identified contained in the second data and the extraction information extracted from the material content data, the server 1 associates the material content data in the material content data storage unit 163 with the second data using the association unit 150 (step S105), and allocates that material content data to the cuts with the material content data allocation unit 146 (step S106).
Next, using the second data placement position determination unit 180, the server 1 determines the placement position of the second data field 31 on each cut based on the saliency region, detected from the material content data, relating to the object or range having saliency in the image of that material content data (step S107).
Then, the server 1 generates the base data to which the second data and the material content data have been allocated as composite content data, stores it in the composite content data storage unit 165, and displays the composite content data on the user terminal 3 (step S108). The placement position of the second data field 31 in each cut has been determined by the second data placement position determination unit 180 as described above, and the server 1 inserts the second data field 31 into each cut according to that placement position. As illustrated in Fig. 7, the composite content data can be displayed as an on-screen list of the plurality of cuts that constitute it. For each cut, the playback time (in seconds) of the cut may be displayed together with the displayed material content data and second data. The user can, for example, correct the content of the second data field 31 by clicking it or a corresponding button, and can replace the material content data by clicking the material content data field 32 or a corresponding button. Furthermore, the user can add other material content data to each scene from the user terminal.
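Read end to end, steps S101 to S108 amount to a simple pipeline. The sketch below strings the steps together with trivially stubbed components, purely to show the data flow from first data to per-cut composite content; the splitting, association, and placement logic here are placeholders, not the methods of the specification.

```python
def create_composite_content(first_data: str) -> list[dict]:
    """End-to-end sketch of steps S101-S108 with stubbed components.
    S101 (reception of the first data) happens before this call."""
    second_data = [s for s in first_data.split(". ") if s]      # S102: stub split
    cuts = [{"index": i + 1} for i in range(len(second_data))]  # S103: one cut per piece
    for cut, text in zip(cuts, second_data):                    # S104: allocate second data
        cut["text"] = text
    for cut in cuts:                                            # S105-S106: stub association
        cut["material"] = pick_material(cut["text"])
    for cut in cuts:                                            # S107: stub placement
        cut["text_field_xy"] = decide_placement(cut["material"])
    return cuts                                                 # S108: composite content data


def pick_material(text: str) -> str:
    """Placeholder for the similarity-based association of the association unit."""
    return "material_A.jpg" if "teacher" in text else "material_B.jpg"


def decide_placement(material: str) -> tuple[int, int]:
    """Placeholder for the saliency-aware placement determination."""
    return (640, 80)  # a fixed anchor standing in for step S107


print(create_composite_content("The teacher smiled. The mountain stood tall."))
```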
The flow of creating composite content data described above is only an example. For instance, step S103 for preparing the base data may be executed at any point, as long as the base data is available before the second data or the material content data is allocated. Likewise, step S104 for allocating the second data, step S105 for the association, step S106 for allocating the material content data, and the step of determining the placement position of the second data field 31 may be executed in any order, as long as no inconsistency arises between them.
The material content data setting unit 190 using the identified information analysis unit 120, the association unit 150, the classifier 170, and the second data placement position determination unit 180 described so far may be one setting function of the composite content data creation system, and the setting method used by the material content data setting unit 190 is not limited to the above. For example, although the base data is generated by the base data generation unit 142 in the example above, it may instead be read out from the base data storage unit 161. The read base data may include, for example, a predetermined number of blank cuts, or may be template data in which predetermined material content data, format information, and the like have already been set for each cut (for example, music data, background images, font information, and so on). Furthermore, as in conventional composite content data creation systems, the user may be allowed to set arbitrary material content for all or part of each data field from the user terminal, or the setting method may be combined with user operations: for example, the user enters arbitrary text into the second data field 31 from the user terminal, the information to be identified is extracted from that text as described above, and material content is associated with it.
(Determination of the Second Data Field Placement Position)
Next, an example of the method by which the second data placement position determination unit 180 of this embodiment determines the placement position of the second data field on each cut will be described with reference to Figs. 9 to 18. The method of determining the second data field placement position is performed in step S107 described above.
Fig. 8 is a diagram explaining an example of the flow of the method of determining the second data field placement position. As shown in Fig. 8, the method includes a step in which the saliency region determination unit 182 determines the saliency region of the material content data (image data) set for each cut (S201), and a step in which the placement position specifying unit 184 specifies the placement position of the second data field 31 on each cut with reference to the saliency region of that image (S202).
First, the determination of the saliency region of an image by the saliency region determination unit 182 will be described. The saliency region determination unit 182 uses a saliency determination model, a trained model for saliency obtained by a known learning method such as the saliency object detection shown in Fig. 9 or the saliency map detection shown in Fig. 12, to determine objects or ranges having saliency in an image (hereinafter also referred to as "saliency regions"). The saliency determination model is stored, for example, in a model storage unit 169 of the storage unit 160. The saliency region determination unit 182 determines the saliency region in an image based on saliency information such as that exemplified by the images shown in Figs. 9 and 11 to 13.
Fig. 9 shows an example of detecting an object having saliency in an image using a saliency object detection model. Saliency object detection using a saliency object detection model can be realized using a known technique such as an encoder-decoder model. In the example shown in Fig. 9, the large and small mountains enclosed by the dotted lines in Fig. 9 are detected as saliency objects in the image, and the saliency region determination unit 182 determines these large and small mountains to be saliency objects.
Figs. 10 and 11 show another example of detecting an object having saliency in an image using the saliency object detection model. For example, when saliency object detection is performed on the image containing an animal shown in Fig. 10, the outline shape of the animal, shown as the relatively bright region in Fig. 11, is detected as the saliency object in the image, and the saliency region determination unit 182 determines the region showing this outline shape to be a saliency object.
Fig. 12 shows an example of detecting a range having saliency in an image using a saliency map detection model. Saliency range detection using a saliency map detection model can be realized using a known technique, for example applying a trained model to a feature map generated from the input image with a convolutional neural network. In saliency range detection using a saliency map detection model, a saliency map is generated by determining the strength of the visual saliency of each pixel in the image. In the example of the saliency map shown in Fig. 12, the darkness of the black portions represents, by way of illustration, the strength of the visual saliency. The saliency region determination unit 182 determines, as the saliency range, the range largely occupied by pixels with relatively strong visual saliency (in the example shown in Fig. 12, the region of the larger of the mountains).
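As a concrete illustration of how a per-pixel saliency map could be reduced to a saliency range, the following sketch thresholds a map with values in [0, 1] and takes the bounding box of the strongly salient pixels. The threshold and the synthetic map are assumptions standing in for a trained model's output.

```python
import numpy as np


def saliency_range(saliency_map, threshold=0.5):
    """Given a per-pixel saliency map in [0, 1], return the bounding box
    (top, left, bottom, right) of the region of relatively strong saliency,
    or None if no pixel exceeds the threshold."""
    ys, xs = np.where(saliency_map >= threshold)
    if ys.size == 0:
        return None
    return int(ys.min()), int(xs.min()), int(ys.max()), int(xs.max())


# Synthetic map standing in for the output of a trained saliency model.
rng = np.random.default_rng(0)
smap = rng.random((90, 160)) * 0.3  # weak background saliency
smap[30:60, 40:100] = 0.9           # one strongly salient patch
print(saliency_range(smap))         # -> (30, 40, 59, 99)
```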
Fig. 13 shows another example of detecting a range having saliency in an image using the saliency map detection model. For example, when saliency range detection is performed on the image containing an animal shown in Fig. 10, the saliency region determination unit 182 determines, as the saliency range in that image, the range showing the outline shape of the animal, shown as the relatively bright region in Fig. 13. In this example of the saliency range, strong visual saliency is detected in the portion corresponding to the animal's face, indicating that the saliency of that portion is particularly high.
Next, the specification of the placement position of the second data field 31 by the placement position specifying unit 184 will be described. The placement position specifying unit 184 specifies the placement position of the second data field 31 on the cut into which the material content data (image) has been inserted, using a placement position determination model, which is a trained model obtained by machine-learning the positional relationship between saliency regions in images and second data fields 31. Like the saliency determination model, the placement position determination model is stored, for example, in the model storage unit 169 of the storage unit 160.
The placement position determination model can be generated by machine learning with any learner, using as training data images to which text has been added and in which the relationship between the saliency region in the image and the placement position of the text is recognized to be good. The learner that generates the placement position determination model extracts the saliency region and the text from each training image and learns their relative positional relationship within the image. Images used as training data are preferably selected based on, for example, the following viewpoints as conditions on the relationship between the saliency region in the image and the placement position of the text:
- The positional relationship between the saliency region and the text is taken into consideration (the text is placed in a direction and at a distance that allow the line of sight to move naturally from the saliency region of the object to the text).
- No part of the text overlaps the saliency region.
- The text is placed near (or, depending on the design, away from) the portion of the saliency region where the saliency is particularly high.
As one example, the placement position specifying unit 184 calculates, for the image data set for each cut, a recommendation region for placing the second data field 31 using the above-described placement position determination model. This calculation may be performed, for example, by scoring the degree of recommendation of the placement position of the second data field 31 for each pixel of the image. Figs. 14 and 16 each show a second data field placement recommendation region calculated by the placement position specifying unit 184. Fig. 14 shows an example in which the second data field placement recommendation region appears to the upper right of the large mountains that are the saliency objects in the image shown in Fig. 9; in this example, the portion of the recommendation region shown with high density has a high score. Fig. 16 shows an example in which the second data field placement recommendation region appears to the right of the animal that is the saliency object in the image shown in Fig. 11; in this example, the portion of the recommendation region shown with low density has a high score. Although Figs. 14 and 16 show the second data field placement recommendation regions specified by the placement position specifying unit 184 for explanation and visualization, displaying the recommendation region is not essential in the actual processing by the placement position specifying unit 184.
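A minimal sketch of this per-pixel variant follows: given a score map of the kind the placement position determination model is assumed to output, the highest-scoring pixel is taken as the anchor of the second data field 31. The Gaussian score map below, peaking to the upper right of an assumed salient object, is a synthetic stand-in for the model.

```python
import numpy as np


def recommend_position(score_map: np.ndarray) -> tuple[int, int]:
    """Pick the pixel with the highest placement-recommendation score
    as the anchor position of the second data field."""
    y, x = np.unravel_index(int(np.argmax(score_map)), score_map.shape)
    return int(y), int(x)


# Stand-in score map: scores fall off with distance from a point to the
# upper right of a salient object (the model's actual output is assumed).
h, w = 90, 160
ys, xs = np.mgrid[0:h, 0:w]
score_map = np.exp(-(((ys - 20) ** 2 + (xs - 120) ** 2) / (2 * 15.0 ** 2)))
print(recommend_position(score_map))  # -> (20, 120)
```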
The placement position specifying unit 184 specifies, as the placement position of the second data field 31, the portion with the higher score within the second data field placement recommendation region obtained as described above. Fig. 15 shows the state in which the second data field 31 is placed in the high-scoring portion, shown with high density, of the recommendation region in the example of Fig. 14, and Fig. 17 shows the state in which the second data field 31 is placed in the high-scoring portion, shown with low density, of the recommendation region in the example of Fig. 16.
In this way, according to the second data placement position determination unit 180, the saliency region in the image is determined based on the saliency information and the placement position of the second data field 31 is specified in consideration of its positional relationship with the saliency region, so the server 1 can easily create composite content data in which, on each cut, the second data field 31 is placed at an appropriate position relative to the saliency region in the image.
In this example, the placement position specifying unit 184 of the second data placement position determination unit 180 specifies the placement position of the second data field 31 using the placement position determination model; however, depending on, for example, the position and size of the saliency region and/or the area occupied by the second data field 31, a position where part of the second data field 31 overlaps the saliency region in the image may be specified as the placement position. Therefore, the second data placement position determination unit 180 is preferably configured to determine whether the specified placement position is one where part of the second data field 31 overlaps the saliency region in the image and, when it determines that they overlap, to re-specify, as the placement position, a position shifted so that the second data field 31 no longer overlaps the saliency region. When the second data placement position determination unit 180 determines that part of the second data field 31 overlaps the saliency region in the image, it specifies, for example, the position that eliminates the overlap with the shortest displacement from the initially specified placement position, and corrects the placement position of the second data field 31 to that position.
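One way to realize this minimal-displacement correction, assuming the text field and the saliency region are approximated by axis-aligned rectangles, is to try the four axis-aligned pushes and keep the shortest one, as in this sketch.

```python
def overlaps(field: tuple, saliency: tuple) -> bool:
    """Axis-aligned rectangle intersection test; rectangles are
    given as (top, left, bottom, right)."""
    t1, l1, b1, r1 = field
    t2, l2, b2, r2 = saliency
    return not (b1 < t2 or b2 < t1 or r1 < l2 or r2 < l1)


def shift_out_of_overlap(field: tuple, saliency: tuple) -> tuple:
    """Move the field by the smallest displacement that clears the
    saliency rectangle (one of four axis-aligned pushes)."""
    if not overlaps(field, saliency):
        return field
    t, l, b, r = field
    st, sl, sb, sr = saliency
    pushes = {
        "up":    st - 1 - b,   # move up until the field's bottom clears the region
        "down":  sb + 1 - t,   # move down until the field's top clears it
        "left":  sl - 1 - r,
        "right": sr + 1 - l,
    }
    direction = min(pushes, key=lambda k: abs(pushes[k]))
    d = pushes[direction]
    if direction in ("up", "down"):
        return (t + d, l, b + d, r)
    return (t, l + d, b, r + d)


print(shift_out_of_overlap((40, 50, 60, 120), (30, 40, 70, 100)))  # -> (9, 50, 29, 120)
```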
As described with reference to Fig. 11, when saliency object detection is used, the approximate overall shape of the detected saliency object can be recognized, but the intensity distribution of saliency inside the object cannot be obtained; as a result, the second data placement position determination unit 180 may specify, as the placement position of the second data field 31, a position far from the more salient portion of the saliency object. On the other hand, as described with reference to Fig. 13, when saliency map detection is used, the overall shape of the detected saliency object is unclear, so when the second data field 31 is actually placed at the placement position specified by the second data placement position determination unit 180, part of the second data field 31 may end up overlapping the saliency region.
Therefore, as shown in Fig. 18, by acquiring saliency information using a hybrid saliency map detection model that combines saliency object detection and saliency map detection, it becomes possible to capture both the contour of the visually attention-drawing saliency region and the locations within that region where the saliency is higher. This allows the placement position specifying unit 184 of the second data placement position determination unit 180 to place the second data field 31 at a more appropriate position relative to the detected saliency region.
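The specification does not spell out how the two detections are combined; one simple assumption is to gate the continuous saliency map with the binary object mask, so that the object's outline bounds the region while the map supplies the internal intensity:

```python
import numpy as np


def hybrid_saliency(object_mask: np.ndarray, saliency_map: np.ndarray) -> np.ndarray:
    """Combine a binary salient-object mask (clean outline, no intensity)
    with a continuous saliency map (intensity, blurry outline): the mask
    gates the map so both the contour and the high-saliency hot spots
    inside it are preserved."""
    return object_mask.astype(float) * saliency_map


mask = np.zeros((6, 8))
mask[1:5, 2:7] = 1.0                             # detected object outline
smap = np.random.default_rng(1).random((6, 8))   # per-pixel saliency strength
hybrid = hybrid_saliency(mask, smap)
print(np.round(hybrid, 2))                       # nonzero only inside the outline
```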
Furthermore, since the accuracy of saliency detection is affected by image quality, the accuracy of detecting the saliency region can be further increased by first raising the resolution and/or the dynamic range of the image, for example by combining known resolution up-conversion and/or HDR (high dynamic range) conversion techniques, and then detecting the saliency region in that image.
[Modification]
In the above embodiment, as an example of how the placement position specifying unit 184 calculates the recommendation region for placing the second data field 31, the degree of recommendation of the placement position of the second data field 31 was scored for each pixel of the image. In that case, since a scoring value is obtained for each pixel, fine-grained information on the degree of recommendation of the placement position of the second data field 31 is obtained. On the other hand, the scoring value assigned to each individual pixel among a large number of pixels becomes extremely small, so the placement recommendation region of the second data field 31 specified by their distribution can become a uniform, relatively wide area.
In this modification, by contrast, when the placement position specifying unit 184 calculates the recommendation region for placing the second data field 31 for the image data set for each cut using the placement position determination model, it first divides the target image into a specific number of cells, scores the degree of recommendation of the placement position of the second data field 31 for each cell, and takes the position of the cell with the highest scoring value as the placement position of the second data field 31. The functions and operations of the other components of the server 1 implementing this modification are as described with reference to Figs. 2 and 4 and elsewhere, so their description is omitted here.
Fig. 19 is a diagram showing this modification, in which the placement position specifying unit 184 specifies the placement position of the second data field 31 on a cut into which material content data (an image) has been inserted.
In this example, the placement position specifying unit 184 divides the entire image on the cut into which the material content data (image) has been inserted into a plurality of cells, calculates for each of those cells a scoring value relating to the degree of recommendation of the placement position of the second data field 31 using the placement position determination model, and specifies the cell with the highest scoring value as the cell in which the second data field 31 should be placed. Fig. 19 shows an example in which the entire image is divided into 18 cells, 3 vertically by 6 horizontally, but the number of cells dividing the entire image may be set arbitrarily. Although the boundaries between cells are drawn as chain lines in Fig. 19, these are shown for explanation only; such boundary lines are not actually drawn in the processing by the placement position specifying unit 184.
Next, the placement position specification processing for the second data field 31 by the placement position specifying unit 184 of this example will be described.
First, the placement position specifying unit 184 divides the entire image of the cut into which the material content data (image) has been inserted into a predetermined number of cells (see Fig. 19(a)). As an example, this example shows division into 18 cells, 3 vertically by 6 horizontally, but the number of cells dividing the entire image can be set arbitrarily.
Next, the placement position specifying unit 184 calculates, for the image data set for each cut, the recommendation region for placing the second data field 31 using the above-described placement position determination model (see Fig. 19(b)). This calculation is performed by scoring the degree of recommendation of the placement position of the second data field 31 for each cell of the image divided as shown in Fig. 19(a). Fig. 19(b) shows the scoring value calculated for each cell. Although the scoring values of the cells are shown in Fig. 19(b) for explanation and visualization, displaying these scoring values is not essential in the actual processing by the placement position specifying unit 184.
Next, the placement position specifying unit 184 specifies, as the placement position of the second data field 31, the cell with the highest calculated scoring value among these cells (see Fig. 19(c)). In the example shown in Fig. 19(b), the cell in the second column from the right in the top row has the highest scoring value, 0.34, so that cell is specified as the placement position of the second data field 31. In Fig. 19(c), the cell specified as the placement position of the second data field 31 is shown shaded for explanation and visualization, but no shading is generated or displayed in the actual processing by the placement position specifying unit 184.
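A compact sketch of this cell-based variant follows. Here the per-cell score is approximated by averaging a per-pixel score map over each cell (an assumption made for illustration; the model could equally score cells directly), and the 3 x 6 grid matches Fig. 19.

```python
import numpy as np


def best_cell(score_map: np.ndarray, rows: int = 3, cols: int = 6) -> tuple[int, int]:
    """Divide the image-sized score map into rows x cols cells, average
    the scores inside each cell, and return the (row, col) of the
    highest-scoring cell."""
    h, w = score_map.shape
    cell_scores = np.zeros((rows, cols))
    for r in range(rows):
        for c in range(cols):
            cell = score_map[r * h // rows:(r + 1) * h // rows,
                             c * w // cols:(c + 1) * w // cols]
            cell_scores[r, c] = cell.mean()
    r, c = np.unravel_index(int(np.argmax(cell_scores)), cell_scores.shape)
    return int(r), int(c)


# Synthetic score map peaking in the top row, second column from the right.
ys, xs = np.mgrid[0:90, 0:180]
score_map = np.exp(-(((ys - 10) ** 2 + (xs - 130) ** 2) / (2 * 20.0 ** 2)))
print(best_cell(score_map))  # -> (0, 4)
```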
Thus, according to this example, the entire image of the cut is divided into a predetermined number of cells, the placement position specifying unit 184 calculates for each cell a scoring value relating to the degree of recommendation of the placement position of the second data field 31, and the cell with the highest scoring value is specified as the placement position of the second data field 31. Compared with the method described above, in which a scoring value is calculated for each pixel of the image, this method scores far fewer targets, so the differences between the scoring values of the cells become clear; as a result, the cell with the highest scoring value can easily be identified unambiguously as the optimal placement position for the second data field 31.
 FIG. 20 is a diagram showing various examples of placing the second data field 31 at the position of the cell specified by the method shown in FIG. 19.
 Depending on the number of cells into which the entire image of the cut is divided, a single cell may be too small to accommodate the second data field 31 completely. It is therefore preferable that the second data field 31 be placed with respect to the specified cell in accordance with settings made in advance in the placement position specifying unit 184.
 FIG. 20(a) shows an example in which the placement position specifying unit 184 is set to place the second data field 31 at the center of the specified cell. In this example, the placement position specifying unit 184 places the second data field 31 at the center of the specified cell so that the center of the cell and that of the second data field 31 approximately coincide. As a result, in the example shown in FIG. 20(a), the left and right portions of the second data field 31 extend into the cells adjacent on the left and right, respectively.
 FIG. 20(b) shows an example in which the placement position specifying unit 184 is set to place the second data field 31 as far as possible within the specified cell from the saliency region in the image (the large mountains in the example shown in FIG. 20). In this example, the placement position specifying unit 184 places the second data field 31 in the specified cell so that it runs along the upper edge of the cell and its right portion extends into the cell adjacent on the right.
 FIG. 20(c) shows an example in which the placement position specifying unit 184 is set to place the second data field 31 as close as possible within the specified cell to the saliency region (the large mountains) in the image. In this example, the placement position specifying unit 184 places the second data field 31 in the specified cell so that it runs along the lower edge of the cell and its left portion extends into the cell adjacent on the left.
 The various settings of the placement position specifying unit 184 described above can be changed as appropriate in the server 1 in response to, for example, user input received from the user terminal 3 via the communication unit 110.
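 As a minimal sketch of the three settings illustrated in FIG. 20, the following Python function places the field relative to the specified cell. Rectangles are (x, y, width, height) tuples, the mode names are illustrative assumptions, and, as in the figure, the saliency region is assumed to lie below the specified cell; the horizontal offset is simplified to centering in every mode.

```python
def place_in_cell(cell, field_size, mode="center"):
    """Return the top-left (x, y) of the second data field for the
    specified cell; the field may extend into neighbouring cells."""
    cx, cy, cw, ch = cell
    fw, fh = field_size
    x = cx + (cw - fw) // 2            # simplification: always centered horizontally
    if mode == "center":               # FIG. 20(a): the two centers coincide
        y = cy + (ch - fh) // 2
    elif mode == "far_from_saliency":  # FIG. 20(b): along the cell's upper edge
        y = cy
    elif mode == "near_saliency":      # FIG. 20(c): along the cell's lower edge
        y = cy + ch - fh
    else:
        raise ValueError(f"unknown mode: {mode}")
    return x, y
```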
 In any of the cases described with reference to FIGS. 20(a) to 20(c), the second data field 31 placed in the specified cell must neither protrude outside the region of the entire image of the cut nor overlap the saliency region. As also explained in the embodiment described above, however, depending on, for example, the position and size of the saliency region and/or the region occupied by the second data field 31, the second data field 31 placed as described with reference to FIGS. 20(a) to 20(c) may fail to satisfy these constraints. In this example, therefore, the second data placement position determining unit 180 is preferably configured to determine whether the placed second data field 31 protrudes outside the region of the entire image of the cut or overlaps the saliency region and, when it determines that at least one of these applies, to specify again, as the placement position of the second data field 31, a position shifted to one where the second data field 31 satisfies the above constraints (a position where it neither protrudes outside the region of the entire image of the cut nor overlaps the saliency region).
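 The constraint check just described might look like the following sketch; the clamp-then-shift strategy is one plausible reading of the text, not the specified algorithm, and rectangles are again (x, y, width, height) tuples.

```python
def rects_overlap(a, b):
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah

def enforce_constraints(field, frame_size, saliency):
    """Shift the placed second data field so that it stays inside the
    cut image and clear of the saliency region."""
    x, y, w, h = field
    fw, fh = frame_size
    x = min(max(x, 0), fw - w)   # keep the field inside the frame
    y = min(max(y, 0), fh - h)
    if rects_overlap((x, y, w, h), saliency):
        sx, sy, sw, sh = saliency
        # Move the field just above the saliency region, or just below
        # it when there is no room above; a fuller implementation would
        # also consider horizontal shifts.
        y = sy - h if sy - h >= 0 else sy + sh
        y = min(max(y, 0), fh - h)
    return x, y, w, h
```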
<Other Examples>
 In the embodiment above, an example was described in which the placement position specifying unit 184 specifies the placement position of the second data field 31 on the cut into which the material content data (image) has been inserted, using the placement position determination model, a trained model obtained by machine learning of the relationship between the placement positions of saliency regions in images and the second data field 31. According to this example, the second data field 31 can be placed at an appropriate position that takes the positional relationship between the saliency region and the second data field 31 into account. On the other hand, a particularly large computational cost is required when, for example, scoring is performed for each pixel in the image data.
 FIG. 21 is a diagram showing another example of how the placement position specifying unit 184 specifies the placement position of the second data field 31 on the cut into which the material content data (image) has been inserted.
 In this example, the placement position specifying unit 184 divides the entire image of the cut into which the material content data (image) has been inserted into a plurality of cells and specifies the optimal cell in which to place the second data field 31. FIG. 21 shows an example in which the entire image is divided into 9 cells, 3 rows by 3 columns, but the number of cells into which the entire image is divided may be set arbitrarily. In FIG. 21 the boundary lines separating the cells are drawn as chain lines, but these are shown only for explanation; no such boundary lines are actually drawn in the processing by the placement position specifying unit 184. The functions and operations of the other components of the server 1 that implements this example are as described with reference to FIGS. 2, 4, and so on, so their description is omitted here.
 Next, the placement position specifying processing for the second data field 31 by the placement position specifying unit 184 of this example will be described.
 First, the placement position specifying unit 184 divides the entire image of the cut into which the material content data (image) has been inserted into a predetermined number of cells (see FIG. 21(a)). Next, the placement position specifying unit 184 excludes from the placement position candidates for the second data field 31 any cell that contains even part of the saliency region identified by the saliency region discriminating unit 182 (see FIG. 21(b)). In FIG. 21(b), the cells excluded from the placement position candidates are marked with an X for explanation and visualization, but no X marks are generated or displayed in the actual processing by the placement position specifying unit 184. Finally, the placement position specifying unit 184 specifies, from among the cells remaining after the exclusion, the cell closest to the saliency region in the image as the placement position of the second data field 31 (see FIG. 21(c)). In FIG. 21(c), the cell specified as the placement position of the second data field 31 is shaded for explanation and visualization, but no shading is generated or displayed in the actual processing by the placement position specifying unit 184.
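 A minimal Python sketch of the exclusion-based selection shown in FIG. 21 follows, assuming a rectangular saliency region and the 3-row-by-3-column grid of the figure; the rectangle-intersection test and the center-to-center distance measure are illustrative assumptions.

```python
def pick_cell(frame_size, saliency, rows=3, cols=3):
    """Exclude every cell containing any part of the saliency region,
    then return the (row, col) of the remaining cell closest to it.
    Raises ValueError if the saliency region touches every cell."""
    fw, fh = frame_size
    sx, sy, sw, sh = saliency
    cw, ch = fw / cols, fh / rows

    def intersects(r, c):
        x0, y0 = c * cw, r * ch
        return x0 < sx + sw and sx < x0 + cw and y0 < sy + sh and sy < y0 + ch

    def distance(cell):
        r, c = cell
        dx = (c + 0.5) * cw - (sx + sw / 2)   # cell center to saliency center
        dy = (r + 0.5) * ch - (sy + sh / 2)
        return dx * dx + dy * dy

    candidates = [(r, c) for r in range(rows) for c in range(cols)
                  if not intersects(r, c)]
    if not candidates:
        raise ValueError("saliency region overlaps every cell")
    return min(candidates, key=distance)      # "closest to the saliency region"
```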
 Thus, according to this example, from among the plurality of candidate placement positions for the second data field 31 generated by dividing the entire image of the cut into a predetermined number of cells, the cell judged to be optimal under a predetermined condition is specified as the placement position of the second data field 31. Compared with the method of specifying the placement position of the second data field 31 using the placement position determination model described above, the appropriateness of the placement position with respect to the saliency region may be somewhat lower with this method, but it has the advantage of requiring less computational cost.
 In this example, "closest to the saliency region" was described as the condition for selecting the placement position of the second data field 31, but the selection condition is not limited to this. For example, the condition may be a position at a predetermined distance from the saliency region, the position farthest from it, a position in a predetermined direction from it, and so on; furthermore, these conditions may be combined arbitrarily as long as they do not contradict one another.
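 Continuing the sketch above, such alternative conditions could be expressed as interchangeable filters and keys over the surviving candidate cells; the condition names and the `d_min` threshold are illustrative assumptions.

```python
def select(candidates, distance, condition="nearest", d_min=0.0):
    """Apply one of the alternative selection conditions to the cells
    that survived the exclusion step."""
    eligible = [c for c in candidates if distance(c) >= d_min]  # "at a predetermined distance"
    if condition == "nearest":
        return min(eligible, key=distance)
    if condition == "farthest":
        return max(eligible, key=distance)
    raise ValueError(f"unknown condition: {condition}")
```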
 According to the present system of the embodiment examples described above, it is possible to easily create composite content data without preparing editing software, servers, editors with specialized skills, and the like in-house. For example, use in the following situations is envisaged:
 1) Turning information on products sold in EC shops into videos
 2) Distributing press release information, CSR information, and the like as videos
 3) Turning manuals such as usage instructions and operation flows into videos
 4) Producing creatives that can be used as video advertisements
 Although preferred embodiment examples of the present invention have been described above, the technical scope of the present invention is not limited to the description of the above embodiments. Various modifications and improvements can be made to the above embodiment examples, and forms incorporating such modifications or improvements are also included in the technical scope of the present invention.
1 Server
2 Administrator terminal
3 User terminal

Claims (12)

  1.  A server comprising:
      a material content data setting unit that sets image data for a cut; and
      a text field placement position determining unit that determines the position of a text field to be placed on the cut by referring to a saliency region in the image data.

  2.  The server according to claim 1, wherein the text field placement position determining unit comprises:
      a saliency region discriminating unit that discriminates a saliency region included in the image data; and
      a placement position specifying unit that specifies the placement position of the text field on the cut by referring to the saliency region.

  3.  The server according to claim 2, wherein the placement position specifying unit is configured to specify the placement position of the text field on the cut based on a model generated by learning, as teacher data, image data in which the relationship between a saliency region in an image and a text field satisfies a predetermined condition.

  4.  The server according to claim 3, wherein the placement position specifying unit is configured to specify the placement position of the text field on the cut based on a scoring value calculated for each pixel of the image data.

  5.  The server according to claim 3, wherein the placement position specifying unit is configured to specify the placement position of the text field on the cut based on a scoring value calculated for each of a plurality of cells into which the image data is divided.

  6.  The server according to claim 2, wherein the placement position specifying unit is configured to execute:
      dividing the entire image of the cut for which the image data is set into a plurality of cells;
      excluding, from among the plurality of cells, any cell that contains at least part of the saliency region; and
      specifying, from among the cells remaining after the exclusion, a cell that satisfies a predetermined condition regarding its relationship with the saliency region as the placement position of the text field.

  7.  The server according to any one of claims 1 to 6, wherein the saliency region is detected by hybrid saliency map detection using saliency object detection and saliency map detection.

  8.  The server according to any one of claims 1 to 6, wherein the saliency region is detected by saliency map detection.

  9.  The server according to any one of claims 1 to 6, wherein the saliency region is detected by saliency object detection.

  10.  A system comprising the server according to any one of claims 1 to 9.

  11.  A text field placement position method executed by a computer, the method comprising:
      setting image data for a cut; and
      determining the position of a text field to be placed on the cut based on a saliency region in the image data.

  12.  A program for causing a computer to execute a text field placement position method, the method comprising:
      setting image data for a cut; and
      determining the position of a text field to be placed on the cut based on a saliency region in the image data.



Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/JP2021/011672 WO2022201237A1 (en) 2021-03-22 2021-03-22 Server, text field arrangement position method, and program


Publications (1)

Publication Number Publication Date
WO2022201237A1 (en)

Family

ID=83395279

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2021/011672 WO2022201237A1 (en) 2021-03-22 2021-03-22 Server, text field arrangement position method, and program

Country Status (1)

Country Link
WO (1) WO2022201237A1 (en)


Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2014038601A (en) * 2012-08-16 2014-02-27 Naver Corp Automatic image editing device by image analysis, method and computer readable recording medium
WO2016053820A1 (en) * 2014-09-30 2016-04-07 Microsoft Technology Licensing, Llc Optimizing the legibility of displayed text
US20200310631A1 (en) * 2017-11-20 2020-10-01 Huawei Technologies Co., Ltd. Method and Apparatus for Dynamically Displaying Icon Based on Background Image

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
LU, Peng; ZHANG, Hao; PENG, Xujun; JIN, Xiaofu. "Learning the Relation Between Interested Objects and Aesthetic Region for Image Cropping." IEEE Transactions on Multimedia, vol. 23, 9 October 2020, pp. 3618–3630. ISSN 1520-9210. DOI: 10.1109/TMM.2020.3029882. *


Legal Events

Date Code Title Description
121 EP: The EPO has been informed by WIPO that EP was designated in this application

Ref document number: 21932851

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 EP: PCT application non-entry in European phase

Ref document number: 21932851

Country of ref document: EP

Kind code of ref document: A1