JP7841830B2 - Modifying digital scripts using generative adversarial networks

Info

Publication number: JP7841830B2
Application number: JP2024509470A
Authority: JP
Inventors: ラクシット、サルバジット、ケー; サンタル、サティヤ; ジャワハルラール、サミュエル、マシュー; カンナン、スリデヴィ
Original assignee: International Business Machines Corp
Current assignee: International Business Machines Corp
Priority date: 2021-09-03
Filing date: 2022-08-25
Publication date: 2026-04-07
Anticipated expiration: 2042-08-25
Also published as: US11989509B2; GB2624614A; DE112022004259T5; JP2024534796A; WO2023030157A1; US20230071456A1; GB202403627D0; GB2624614B

Description

The present invention relates generally to a method for modifying digital scripts, and in particular to a method and associated system for improving software technology associated with generating and modifying an image sequence associated with the text content of a digital story and dynamically changing the associated digital text content.

A first aspect of the present invention provides a generative adversarial network (GAN) hardware device comprising a processor coupled to a computer-readable memory unit, the memory unit comprising instructions that, when executed by the processor, implement a digital script modification method that enables natural language processing (NLP), the method comprising: generating, by the processor, an image sequence associated with the text content of a digital story; identifying, by the processor via execution of NLP code, a plurality of contextual dimensions within the text content; selecting, by the processor in response to user input, a group of dimensions from the plurality of contextual dimensions; expanding or contracting, by the processor, the image sequence in combination with the group of dimensions; changing, by the processor, the image sequence based on detected interactions with the group of dimensions; extracting, by the processor during presentation of the digital story and the image sequence, a dimension from the group of dimensions; enabling, by the processor, a script writer associated with the text content of the digital story to modify the dimension; modifying, by the processor, the image sequence based on the modification of the dimension that occurs in response to the enabling; enabling, by the processor, a hardware interface device for interacting with various image sequences of the image sequence and changing the plurality of contextual dimensions; and dynamically changing, by the processor in response to the enabling, the text content of the digital story.

A second aspect of the present invention provides a digital script modification method that enables natural language processing (NLP), the method comprising: generating, by a processor of a generative adversarial network (GAN) hardware device, an image sequence associated with the text content of a digital story; identifying, by the processor via execution of NLP code, a plurality of contextual dimensions within the text content; selecting, by the processor in response to user input, a group of dimensions from the plurality of contextual dimensions; expanding or contracting, by the processor, the image sequence in combination with the group of dimensions; changing, by the processor, the image sequence based on detected interactions with the group of dimensions; extracting, by the processor during presentation of the digital story and the image sequence, a dimension from the group of dimensions; enabling, by the processor, a script writer associated with the text content of the digital story to modify the dimension; modifying, by the processor, the image sequence based on the modification of the dimension that occurs in response to the enabling; enabling, by the processor, a hardware interface device for interacting with various image sequences of the image sequence and changing the plurality of contextual dimensions; and dynamically changing, by the processor in response to the enabling, the text content of the digital story.

A third aspect of the present invention provides a computer program product comprising a computer-readable hardware storage device storing computer-readable program code, the computer-readable program code comprising an algorithm that, when executed by a processor of a server, implements a digital script modification method that enables natural language processing (NLP), the method comprising: generating, by the processor, an image sequence associated with the text content of a digital story; identifying, by the processor via execution of NLP code, a plurality of contextual dimensions within the text content; selecting, by the processor in response to user input, a group of dimensions from the plurality of contextual dimensions; expanding or contracting, by the processor, the image sequence in combination with the group of dimensions; changing, by the processor, the image sequence based on detected interactions with the group of dimensions; extracting, by the processor during presentation of the digital story and the image sequence, a dimension from the group of dimensions; enabling, by the processor, a script writer associated with the text content of the digital story to modify the dimension; modifying, by the processor, the image sequence based on the modification of the dimension that occurs in response to the enabling; enabling, by the processor, a hardware interface device for interacting with various image sequences of the image sequence and changing the plurality of contextual dimensions; and dynamically changing, by the processor in response to the enabling, the text content of the digital story.

The present invention advantageously provides a simple method and associated system capable of automating the modification of digital scripts.

A system for improving software technology associated with generating and modifying an image sequence associated with the text content of a digital story and dynamically changing the associated digital text content, according to an embodiment of the present invention.

An algorithm detailing the process flow enabled by the system of Figure 1 for improving that software technology, according to an embodiment of the present invention.

An internal structural view of the software/hardware of Figure 1, according to an embodiment of the present invention.

A system including a GAN module and an NLP module for modifying a digital script of digital story content, according to an embodiment of the present invention.

A process for changing a digital script and generating a corresponding image sequence, according to an embodiment of the present invention (four drawings).

A detailed view of the text-to-image GAN network component of Figure 5, according to an embodiment of the present invention.

A detailed view of the image-to-text GAN network component of Figure 5, according to an embodiment of the present invention.

A computer system used by the system of Figure 1 for improving software technology associated with generating and modifying an image sequence associated with the text content of a digital story and dynamically changing the associated digital text content, according to an embodiment of the present invention.

A cloud computing environment, according to an embodiment of the present invention.

A set of functional abstraction layers provided by the cloud computing environment, according to an embodiment of the present invention.

Figure 1 shows a system 100 for improving software technology associated with generating and modifying an image sequence associated with the text content of a digital story and dynamically changing the associated digital text content, according to an embodiment of the present invention. A typical application script writing system may require a script writer entity to visualize a video presentation, which in turn requires text content analysis for image sequence creation. During that process, the script writer entity may also generate requests to expand the generated image sequence with respect to various dimensions. A request may further include a specification for displaying a summary of the associated content together with a command to modify the associated images, so that the text content related to the image modification is updated. The system is therefore configured to enable the script writer entity to expand the generated image content within various contextual dimensions in order to perform image modification. Likewise, system 100 enables the text content to be updated automatically with respect to the image modification.

System 100 includes a system that enables natural language processing (NLP) for analyzing digital text story content, identifying various associated contextual dimensions, and automatically generating an image sequence based on the text story content via execution of a generative adversarial network (GAN). Likewise, system 100 is configured to enable a script writer entity to expand the generated images, change them, or both, in order to dynamically update the digital text story content. System 100 enables the following functionality.

System 100 enables a process for identifying a plurality of possible contextual dimensions from digital script content (during the process of generating an image sequence from the digital script content by use of a GAN), such that the generated image sequence is expanded or contracted with respect to the selected dimensions. Likewise, system 100 enables a process for changing the image sequence (by use of the GAN) based on interactions with the identified dimensions.

System 100 further enables a process for extracting contextual dimensions from the digital text story content presented together with the generated image sequence, thereby enabling the script writer entity to modify a dimension based on a determined need. The modified dimension allows the image sequence to be modified. System 100 can be configured to enable the script writer entity to add further dimensions to the generated image sequence. A newly added dimension can be modified with respect to the generated image sequence, thereby changing the image sequence and (via execution of an inverse GAN model) the associated text script content. System 100 may be further configured to enable the script writer entity to selectively change, remove, or add one or more objects in the generated image sequence, or any combination of these, so that the written text story content is changed dynamically.

The script writer entity can be enabled to split or join multiple image sequences (while interacting with the image sequences), which in turn enables an automated process that splits or merges the text story context to create new story content. A virtual reality (VR) user interface can allow a user to interact with the various image sequences, change the contextual dimensions, and dynamically change the text story content.

System 100 of Figure 1 includes GAN hardware 139, a text/digital story input component 140, a hardware interface 115, and a network interface controller 153, interconnected via a network 117. GAN hardware 139 includes sensors 112, circuitry 127, and software/hardware 121. Hardware interface 115 may include any type of hardware-based interface, including, among others, a virtual reality interface. GAN hardware 139, text/digital story input component 140, and hardware interface 115 may each comprise an embedded device. An embedded device is defined herein as a dedicated device or computer comprising a combination of computer hardware and software (fixed in capability or programmable) specifically designed to execute a specialized function. Programmable embedded computers or devices may comprise specialized programming interfaces. In one embodiment, GAN hardware 139, text/digital story input component 140, and hardware interface 115 may each comprise a specialized hardware device comprising specialized (non-generic) hardware and circuitry (i.e., specialized discrete non-generic analog, digital, and logic-based circuitry) for executing (independently or in combination) the processes described with respect to Figures 1 to 6. The specialized discrete non-generic analog, digital, and logic-based circuitry (e.g., sensors 112, circuitry/logic 127, software/hardware 121, etc.) may include proprietary, specially designed components (e.g., a specialized integrated circuit, such as an application-specific integrated circuit (ASIC), designed only for implementing an automated process for improving software technology associated with generating and modifying an image sequence associated with the text content of a digital story and dynamically changing the associated digital text content). Sensors 112 may include any type of internal or external sensor, including, among others, GPS sensors, Bluetooth beaconing sensors, cellular telephone detection sensors, Wi-Fi positioning detection sensors, triangulation detection sensors, activity tracking sensors, temperature sensors, ultrasonic sensors, optical sensors, video retrieval devices, humidity sensors, voltage sensors, and network traffic sensors. Network 117 may include any type of network, including, among others, a local area network (LAN), a wide area network (WAN), the Internet, and a wireless network.

System 100 is able to execute a process (via execution of a natural language processing model) for performing text analysis of the digital content of a story script. Based on the analysis, system 100 is configured to identify various contextual dimensions within the story content. Likewise, system 100 analyzes a knowledge corpus with respect to various dimensions such as, among others, a weather-related dimension, an event-related dimension, a location-related dimension, a time-related dimension, a dimension based on physical X, Y, Z position, and a speed-related dimension. System 100 is further configured to modify the degree of a contextual dimension, for example a low degree of bad weather versus a high degree of bad weather. In addition, system 100 is configured to generate an image sequence from the digital story script content. The image sequence is useful for identifying the various possible dimensions in the text story script content.
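The idea of a contextual dimension that carries a degree can be illustrated with a small sketch. This is not the NLP model described in the specification: it assumes a hand-written keyword lexicon (`LEXICON`), and the names `ContextDimension`, `identify_dimensions`, and `change_degree` are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass, replace

# Hypothetical lexicon standing in for the NLP model and knowledge corpus:
# dimension name -> {keyword: degree label}.
LEXICON = {
    "weather": {"drizzle": "low", "rain": "medium", "storm": "high"},
    "time": {"dawn": "morning", "noon": "midday", "midnight": "night"},
    "speed": {"strolled": "low", "ran": "medium", "sprinted": "high"},
}


@dataclass(frozen=True)
class ContextDimension:
    name: str      # e.g. "weather"
    degree: str    # e.g. "high"
    evidence: str  # the word in the text that triggered the match


def identify_dimensions(text: str) -> list[ContextDimension]:
    """Return the first keyword hit per dimension, in lexicon order."""
    words = [w.strip(".,;:!?\"'").lower() for w in text.split()]
    found = []
    for name, keywords in LEXICON.items():
        for word in words:
            if word in keywords:
                found.append(ContextDimension(name, keywords[word], word))
                break  # one hit per dimension is enough for this sketch
    return found


def change_degree(dim: ContextDimension, new_degree: str) -> ContextDimension:
    """Return a copy of the dimension with a different degree."""
    return replace(dim, degree=new_degree)
```

A script writer lowering the severity of the weather would correspond to something like `change_degree(dims[0], "low")`; a real system would then feed the changed dimension back into image generation.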

Figure 2 shows an algorithm detailing the process flow enabled by system 100 of Figure 1 for improving software technology associated with generating and modifying an image sequence associated with the text content of a digital story and dynamically changing the associated digital text content, according to an embodiment of the present invention. Each step in the algorithm of Figure 2 may be enabled and executed in any order by a computer processor executing computer code. In addition, each step in the algorithm of Figure 2 may be enabled and executed in combination by GAN hardware 139, text/digital story input component 140, and hardware interface 115. In step 200, an image sequence associated with the text content of a digital story is generated by the GAN hardware device. In step 202, a plurality of contextual dimensions within the text content are identified (via execution of NLP code). The contextual dimensions may include, among others, a weather dimension, an event dimension, a location dimension, a time dimension, a physical X, Y, Z position dimension, and a speed dimension.

In step 204, a group of dimensions is selected from the plurality of contextual dimensions in response to user input. In step 208, the image sequence is expanded or contracted in combination with the group of dimensions. In step 210, the image sequence is changed based on detected interactions with the group of dimensions. In step 212, a dimension is extracted from the group of dimensions during presentation of the digital story and the image sequence. In step 214, a script writer entity (associated with the text content of the digital story) is enabled to modify the dimension. In step 216, the image sequence is modified based on the modification of the dimension that occurs in response to the result of step 214. In step 218, a hardware interface device is enabled for interacting with various image sequences of the image sequence and changing the plurality of contextual dimensions. The hardware interface device may include a virtual reality (VR) interface device. In step 220, the text content of the digital story is dynamically changed.
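The ordering of steps 200 to 220 can be sketched as a plain function that threads data through caller-supplied stages. This is a minimal illustration under stated assumptions: every stage is passed in as a callable, the stand-ins in `demo` are toy lambdas rather than GAN or NLP models, step 218 (the hardware interface) is left out, and the names `run_flow` and `demo` are hypothetical.

```python
def run_flow(text, generate, identify, select, edit, regenerate, describe):
    """Thread a story through the stages of Figure 2 (step 218 omitted)."""
    images = generate(text)                      # step 200: text -> images
    dims = identify(text)                        # step 202: find dimensions
    group = {k: dims[k] for k in select(dims)}   # step 204: user picks a group
    images = regenerate(images, group)           # steps 208-210: apply group
    edited = edit(dict(group))                   # steps 212-214: writer edits
    images = regenerate(images, edited)          # step 216: apply the edits
    new_text = describe(images)                  # step 220: images -> text
    return images, edited, new_text


def demo():
    # Toy stand-ins only; none of these is a real model.
    return run_flow(
        "A boat sails in light rain.",
        generate=lambda t: [f"img({t})"],
        identify=lambda t: {"weather": "light rain", "event": "sailing"},
        select=lambda dims: ["weather"],
        edit=lambda g: {**g, "weather": "heavy rain"},
        regenerate=lambda imgs, g: [f"{i}+{sorted(g.items())}" for i in imgs],
        describe=lambda imgs: f"text for {len(imgs)} image(s)",
    )
```

Passing the stages in as parameters keeps the control flow visible without committing to any particular model implementation.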

In step 224, script writer entity functionality may be enabled (via the GAN hardware) as described in the following implementation scenarios.

In a first scenario, the script writer entity is enabled (via the hardware interface device) to add an additional contextual dimension to the image sequence, such that the image sequence is modified. The text content is then modified by executing an inverse GAN model on the result of the image sequence modification.

In a second scenario, the script writer entity is enabled (via the hardware interface device) to selectively change at least one visual object of the image sequence, such that the text content is modified via execution of the inverse GAN model on the result of enabling the script writer entity.

In a third scenario, the script writer entity is enabled (via the hardware interface device) to selectively remove at least one visual object from the image sequence, such that the text content is modified via execution of the inverse GAN model on the result of enabling the script writer entity.

In a fourth scenario, the script writer entity is enabled (via the hardware interface device) to selectively add at least one visual object to the image sequence, such that the text content is modified via execution of the inverse GAN model on the result of enabling the script writer entity.

In a fifth scenario, the script writer entity is enabled (via the hardware interface device, during interaction with the various image sequences) to split multiple image sequences of the image sequence. In response, the text content is split and new text content is generated for the digital story.

In a sixth scenario, the script writer entity is enabled (via the hardware interface device, during interaction with the various image sequences) to join multiple image sequences of the image sequence. In response, the text content is merged and new text content is generated for the digital story.
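The fifth and sixth scenarios pair each image operation with a matching text operation. The sketch below assumes a simplified model in which every image already carries its own passage of text, so splitting or joining the image sequence moves the text along with it. The specification instead regenerates text through an inverse GAN model; the names `Scene`, `split_story`, `join_stories`, and `story_text` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scene:
    image_id: str  # placeholder for an image in the sequence
    text: str      # the passage of story text paired with that image


def split_story(scenes: list[Scene], at: int) -> tuple[list[Scene], list[Scene]]:
    """Split a sequence into two non-empty stories before index `at`."""
    if not 0 < at < len(scenes):
        raise ValueError("split point must leave both parts non-empty")
    return scenes[:at], scenes[at:]


def join_stories(first: list[Scene], second: list[Scene]) -> list[Scene]:
    """Join two sequences; the paired text is merged in the same order."""
    return first + second


def story_text(scenes: list[Scene]) -> str:
    """Recover the story text implied by the current image order."""
    return " ".join(scene.text for scene in scenes)
```

Because text and image travel together in one record, the text stays consistent with whatever order the script writer leaves the images in.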

Figure 3 is an internal structural view of software/hardware 121 of Figure 1, according to an embodiment of the present invention. Software/hardware 121 includes an identification module 304, a change module 305, an extraction module 308, a modification/enablement module 314, and a communication controller 312. Identification module 304 comprises specialized hardware and software for controlling all functions related to the identifying steps of Figure 2. Change module 305 comprises specialized hardware and software for controlling all functions related to the changing steps described with respect to the algorithm of Figure 2. Extraction module 308 comprises specialized hardware and software for controlling all functions related to the extracting steps of Figure 2. Modification/enablement module 314 comprises specialized hardware and software for controlling all functions related to the modifying and enabling steps of the algorithm of Figure 2. Communication controller 312 is enabled for controlling all communications between identification module 304, change module 305, extraction module 308, and modification/enablement module 314.
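One way to picture a controller that mediates all traffic between modules is a small registry that routes each message to a named handler. The sketch below is an assumption about how such routing could look in software, not a description of communication controller 312 itself; the class name, the `register` and `send` methods, and the module names used as keys are hypothetical.

```python
class CommunicationController:
    """Routes payloads between named modules and records each hop."""

    def __init__(self):
        self._modules = {}  # module name -> callable handler
        self.log = []       # (sender, recipient) pairs, in order

    def register(self, name, handler):
        """Attach a module under a name so other modules can address it."""
        self._modules[name] = handler

    def send(self, sender, recipient, payload):
        """Deliver a payload to the recipient and return its reply."""
        if recipient not in self._modules:
            raise KeyError(recipient)
        self.log.append((sender, recipient))
        return self._modules[recipient](payload)
```

Keeping the modules unaware of one another, with only the controller holding references, mirrors the statement that all inter-module communication passes through a single component.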

Figure 4 shows a system 400 including a GAN module 402a and an NLP module 404 for modifying a digital script 405 of digital story content, according to an embodiment of the present invention. System 400 is configured to present a plurality of possible dimensions 412 associated with an image sequence 408 generated from digital script 405, thereby enabling a script writer entity to change the dimensions 412. Likewise, GAN module 402a is configured to change the images of the generated image sequence 408 such that the digital story content is dynamically updated. Changing the images of the generated image sequence 408 results in the generation of a modified image sequence 410. GAN module 402a may be enabled to generate an image sequence corresponding to input text, and the resulting images may be viewed by a user.

The NLP module 404 may be configured to retrieve a knowledge corpus that includes the input text and various dimensions 412 of the digital script 405 (e.g., a weather dimension, a color dimension, etc.). In response, the system 400 analyzes the input text to identify the dimensions available within the digital script 405. The various dimensions 412 of the input text and the relative degrees of those dimensions are identified. For example, the "weather" dimension may include various associated degrees such as, among others, sunny, cloudy, windy, and rainy. The identified dimensions 412 and their relative degrees may be displayed to the user. Likewise, the user may change (e.g., add, update, delete, etc.) the dimensions 412 and their degrees by selection. The images generated by the GAN module 402a, together with the changed dimensions and associated degrees selected by the user, are sent to a second GAN module 402b as conditional input. The system 400 is further configured to execute conditional text-to-image translation code using a cycle-consistent adversarial network that retrieves the input (i.e., the user-selected images and the changed dimensions). The text-to-image module (of the GAN module 402b) is enabled to generate a modified version of the input image with respect to the user-selected dimensions and degrees. An inverse GAN module 402c (i.e., an image-to-text translation module) takes the modified image sequence 410 as input and generates the associated text (i.e., a modified script 415). The user uses the modified image sequence 410 and the corresponding modified script 415 to finalize the digital script.
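The dimension and degree handling described for the system 400 can be pictured with a small data structure. The following is only an illustrative sketch, not the patented implementation: the class names `ContextDimension` and `DimensionSet`, the method names, and the "color" degrees are hypothetical, and the `as_condition` output merely stands in for the conditional input sent to the second GAN module 402b.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ContextDimension:
    """One contextual dimension (e.g. weather) and the degrees it may take."""
    name: str
    degrees: List[str]
    selected: Optional[str] = None  # degree currently chosen by the user, if any


@dataclass
class DimensionSet:
    """Hypothetical container for the dimensions identified in a script."""
    dimensions: Dict[str, ContextDimension] = field(default_factory=dict)

    def add(self, name: str, degrees: List[str], selected: Optional[str] = None) -> None:
        if selected is not None and selected not in degrees:
            raise ValueError(f"{selected!r} is not a degree of {name!r}")
        self.dimensions[name] = ContextDimension(name, list(degrees), selected)

    def update(self, name: str, selected: str) -> None:
        dim = self.dimensions[name]
        if selected not in dim.degrees:
            raise ValueError(f"{selected!r} is not a degree of {name!r}")
        dim.selected = selected

    def delete(self, name: str) -> None:
        self.dimensions.pop(name, None)

    def as_condition(self) -> Dict[str, str]:
        """Collect the user's selections (assumed form of the conditional input)."""
        return {n: d.selected for n, d in self.dimensions.items() if d.selected is not None}


dims = DimensionSet()
dims.add("weather", ["sunny", "cloudy", "windy", "rainy"], selected="sunny")
dims.add("color", ["warm", "cool"])  # hypothetical degrees for illustration
dims.update("weather", "rainy")      # the user changes the degree of a dimension
print(dims.as_condition())
```

Only dimensions with a selected degree are passed on in this sketch; whether unselected dimensions are also forwarded is not stated in the description.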

Figures 5A to 5D show a process 500 for changing a digital script 502 and generating corresponding image sequences 504a and 504b, according to an embodiment of the present invention. The process 500 begins when text content (i.e., the digital script 502) is provided as input to a text-to-image GAN module 506 and an NLP module 507, which perform text analysis of the digital script 502 with respect to a knowledge corpus 509. In response to the text analysis, the system 500 identifies various context dimensions (and dimension degrees) 511 from the story content of the digital script 502. The context dimensions 511 may include, among others, a weather dimension, an event dimension, a location dimension, a time dimension, physical X, Y, and Z position dimensions, and a velocity dimension. The system 500 further enables various degrees of the context dimensions 511. The GAN module 506 then generates an image sequence 504a from the text content of the digital script 502 and identifies the various possible dimensions (of the context dimensions 511) from the text content. The image sequence 504a may be displayed via a hardware/software interface 514 (e.g., a 2D display, a VR device, etc.). The system 500 then presents one or more context dimensions together with the image sequence 504a so that a story scriptwriter entity 517 can be enabled (via a text-to-image GAN network component 522 and an image-to-text GAN network component 524) to change the degrees of the various dimensions presented with the image sequence 504a. In response to the selection of various changed context dimensions 519, the system 500 receives the associated user input and identifies the changed context dimensions 519. The changed context dimensions 519 are used to analyze the current image sequence (i.e., the image sequence 504a) in order to change the images within it. All modifications to the context dimensions 511 are taken into account when updating the images in the image sequence 504a. Similarly, the system 500 enables the scriptwriter entity 517 to add further dimensions, and degrees of selected dimensions, to the image sequence 504a, and the images of the image sequence 504a are updated accordingly. The scriptwriter entity 517 may be enabled to selectively change, add, or remove one or more image objects in the image sequence 504a so that its images are changed. The scriptwriter entity 517 can split or join different images, producing an updated image sequence 504b that can be displayed via a hardware/software interface 527. When the change process is complete, the system 400 updates the digital script 502 with the modified images of the image sequence 504b, producing a modified digital script 528.

Figure 6 is a detailed view of the text-to-image GAN network component 522 of Figure 5, according to an embodiment of the present invention. The GAN network component 522 includes a first stage 602 (stage 1) and a second stage 604 (stage 2). The first stage 602 includes a generator G1 and discriminator D1 pair. Similarly, the second stage 604 includes a generator G2 and discriminator D2 pair. Generator G1 is configured to generate a low-resolution image 607 (e.g., 64x64 ppi), and generator G2 is configured to generate a high-resolution image 609 (128x128 ppi). Associated text embedding data 605 (i.e., the script) and associated noise can be used as input to the first stage 602. In addition, an image and the user-changed dimensions and degrees may be used as input to the first stage 602. Generator G1 may be conditioned on skip-thought text embeddings to generate a synthetic image (i.e., the low-resolution image 607). Similarly, the discriminator D1 of the first stage 602 is conditioned on the same text embeddings and is trained to classify between real and synthetic images at a resolution of 64x64 ppi. Generator G1 includes a series of upsampling blocks 611. Each upsampling block 611 applies nearest-neighbor upsampling followed by a 3x3, stride-1 convolution, so that the input is projected onto a 3x64x64 image (i.e., the low-resolution image 607, a 64x64 ppi image). The discriminator D1 includes a series of downsampling blocks 612 that project the input onto a 512x4x4 dimension. This 512x4x4 dimension is concatenated with a 128-dimensional compressed embedding, and a sigmoid layer 615 produces an output between 0 (fake) and 1 (real) to discriminate the low-resolution image. The stage 2 generator G2 takes I1 (the stage 1 output image) together with the embedding as input and generates a higher-resolution 128x128 image.
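To make the upsampling block concrete, the sketch below applies nearest-neighbor upsampling followed by a 3x3, stride-1 convolution to a single-channel image in plain Python. It is an illustration under stated assumptions rather than the disclosed implementation: the function names are hypothetical, zero padding of 1 is assumed so the convolution preserves size, the doubling factor and the 4x4 starting grid are assumed, and, as in most deep learning libraries, the "convolution" is computed as a cross-correlation.

```python
def upsample_nearest(img, factor=2):
    """Nearest-neighbor upsampling: repeat every pixel `factor` times per axis."""
    out = []
    for row in img:
        wide = [v for v in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def conv3x3(img, kernel):
    """3x3 convolution, stride 1, zero padding 1 (assumed), so size is preserved."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for ky in range(3):
                for kx in range(3):
                    yy, xx = y + ky - 1, x + kx - 1
                    if 0 <= yy < h and 0 <= xx < w:  # pixels outside count as zero
                        acc += img[yy][xx] * kernel[ky][kx]
            out[y][x] = acc
    return out


def upsampling_block(img, kernel):
    """One block: upsample by 2 (assumed factor), then convolve."""
    return conv3x3(upsample_nearest(img, 2), kernel)


# Four such blocks take an assumed 4x4 grid to 64x64 (4 -> 8 -> 16 -> 32 -> 64).
identity = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
grid = [[1.0] * 4 for _ in range(4)]
for _ in range(4):
    grid = upsampling_block(grid, identity)
print(len(grid), len(grid[0]))
```

A real generator would use learned multi-channel kernels with normalization and nonlinearities between blocks; the identity kernel here only makes the spatial behavior easy to inspect.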

The generator G2 includes a series of downsampling blocks 622 that project a 3x64x64 input image 624 onto a 512x16x16 dimension. The 128-dimensional embedding is then concatenated. The result is passed through a series of residual blocks 626 followed by a series of upsampling blocks 628 to generate the (higher-resolution) image 609 (i.e., a 128x128 image). The discriminator D2 receives the image 609 as input. The discriminator D2 includes a series of downsampling blocks 629, after which a sigmoid layer 632 produces an output between 0 (fake) and 1 (real) to discriminate the high-resolution (128x128) image. The output of the first stage 602 is therefore used as input to the second stage 604, which generates a higher-resolution image (i.e., the high-resolution image 609) as an output that includes the user-changed dimensions.
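The tensor shapes quoted for the two stages can be checked with simple arithmetic. The sketch below traces shapes only, under assumptions that are not in the description: each downsampling block is taken to halve height and width, each upsampling block to double them, the 128-dimensional embedding is taken to be tiled over the spatial grid before channel-wise concatenation, and the intermediate channel widths are hypothetical.

```python
EMBED_DIM = 128  # size of the compressed text embedding given in the description


def down(shape, out_channels):
    """Downsampling block, assumed to halve height and width."""
    _, h, w = shape
    return (out_channels, h // 2, w // 2)


def up(shape, out_channels):
    """Upsampling block, assumed to double height and width."""
    _, h, w = shape
    return (out_channels, h * 2, w * 2)


def concat_embedding(shape):
    """Tile the embedding spatially and concatenate along the channel axis."""
    c, h, w = shape
    return (c + EMBED_DIM, h, w)


def stage1_discriminator_shapes():
    shapes = [(3, 64, 64)]
    for ch in (64, 128, 256, 512):  # hypothetical channel widths
        shapes.append(down(shapes[-1], ch))
    shapes.append(concat_embedding(shapes[-1]))
    return shapes


def stage2_generator_shapes():
    shapes = [(3, 64, 64)]
    for ch in (256, 512):  # hypothetical channel widths
        shapes.append(down(shapes[-1], ch))
    shapes.append(concat_embedding(shapes[-1]))
    # Residual blocks are assumed to preserve shape, so they add no entry here.
    for ch in (256, 128, 3):  # hypothetical channel widths, ending in RGB
        shapes.append(up(shapes[-1], ch))
    return shapes


for name, fn in (("D1", stage1_discriminator_shapes), ("G2", stage2_generator_shapes)):
    print(name, " -> ".join("x".join(map(str, s)) for s in fn()))
```

Under these assumptions, four halvings take the 64x64 input of D1 to the stated 512x4x4, two halvings take the input of G2 to the stated 512x16x16, and three doublings return 128x128.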

Figure 7 shows a detailed view of the image-to-text GAN network component 524 (i.e., a caption GAN network component) of Figure 5, according to an embodiment of the present invention. The GAN network component 524 includes a component 702 and a component 704. The component 702 forms a caption generator that includes long short-term memory (LSTM) components 707a...707n, which receive convolutional neural network (CNN) features 708 and noise Z 709 as input and output a caption 711. The input to the component 702 includes the high-resolution image 715 output by the GAN network component 522 of Figure 6, which includes the user-changed dimensions and degrees.

The component 704 forms a discriminator that performs a dot product between CNN features 712 of a high-resolution modified image 715a and the outputs of the LSTM components 707a...707n. The high-resolution modified image 715a, which includes the user's preferred dimensions and degrees, is sent as input to a regular sequence modeling process performed with LSTM components 717a...717n in order to generate the corresponding text. The resulting output text script 720 includes the user-updated dimensions and degrees and is available for the user to finalize.
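The dot-product scoring performed by the discriminator of component 704 can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the function names are hypothetical, the image and caption features are assumed to have already been projected to vectors of equal length, and squashing the dot product with a sigmoid is an assumption borrowed from the sigmoid layers of Figure 6.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def score_caption(image_features, caption_features):
    """Dot product of CNN image features and LSTM caption features.

    The sigmoid maps the raw score into (0, 1); using it here is an
    assumption, since the description only states that a dot product is taken.
    """
    if len(image_features) != len(caption_features):
        raise ValueError("feature vectors must have the same length")
    dot = sum(a * b for a, b in zip(image_features, caption_features))
    return sigmoid(dot)


# Hypothetical feature vectors: an aligned caption scores above 0.5,
# an orthogonal one scores exactly 0.5, an opposed one below 0.5.
image = [1.0, 0.0, 2.0]
print(score_caption(image, [1.0, 0.0, 2.0]))
print(score_caption(image, [0.0, 1.0, 0.0]))
print(score_caption(image, [-1.0, 0.0, -2.0]))
```

In training, such a score would typically be pushed toward 1 for real image and caption pairs and toward 0 for generated ones, which matches the fake/real convention used for D1 and D2.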

Figure 8 shows a computer system 90 (e.g., the GAN hardware 139, the text/digital story input component 140, and the hardware interface 115 of Figure 1) that is used by, or comprised by, the system 100 of Figure 1, according to an embodiment of the present invention, to improve software technology associated with generating and modifying image sequences related to the text content of a digital story and dynamically changing the associated digital text content.

Aspects of the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, resident software, microcode, etc.), or an embodiment combining software and hardware aspects, all of which may generally be referred to herein as a "circuit," "module," or "system."

The present invention may be a system, a method, a computer program product, or a combination thereof. The computer program product may include a computer-readable storage medium storing computer-readable program instructions for causing a processor to carry out aspects of the present invention.

A computer-readable storage medium can be a tangible device that can retain and store instructions for use by an instruction execution device. The computer-readable storage medium may be, for example, but is not limited to, an electronic storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of these. More specific examples of the computer-readable storage medium include a portable computer diskette, a hard disk, RAM, ROM, EPROM or flash memory, SRAM, CD-ROM, DVD, a memory stick, a floppy disk, a mechanically encoded device such as punch cards or raised structures in a groove having instructions recorded thereon, and any suitable combination of these. A computer-readable storage medium, as used herein, is not to be construed as being transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transmission media (e.g., light pulses passing through a fiber-optic cable), or electrical signals transmitted through a wire.

The computer-readable program instructions described herein can be downloaded from a computer-readable storage medium to respective computing/processing devices, or to an external computer or external storage device via a network (e.g., the Internet, a local area network, a wide area network, a wireless network, or a combination thereof). The network may comprise copper transmission cables, optical transmission fibers, wireless transmission, routers, firewalls, switches, gateway computers, edge servers, or a combination thereof. A network adapter card or network interface in each computing/processing device receives computer-readable program instructions from the network and forwards them for storage in a computer-readable storage medium within the respective computing/processing device.

The computer-readable program instructions for carrying out operations of the present invention may be assembler instructions, instruction set architecture (ISA) instructions, machine instructions, machine-dependent instructions, microcode, firmware instructions, state-setting data, or either source code or object code written in any combination of one or more programming languages, including object-oriented programming languages such as Smalltalk, C++, Spark, and R, and procedural programming languages such as the "C" programming language or similar programming languages. The computer-readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, electronic circuitry including, for example, programmable logic circuitry, field-programmable gate arrays (FPGAs), or programmable logic arrays (PLAs) may execute the computer-readable program instructions by using state information of the computer-readable program instructions to personalize the electronic circuitry, in order to perform aspects of the present invention.

Aspects of the present invention are described herein with reference to flowchart illustrations or block diagrams, or both, of methods, apparatus (systems), and computer program products according to embodiments of the present invention. It will be understood that each block of the flowchart illustrations or block diagrams, or both, and combinations of blocks in the flowchart illustrations or block diagrams, or both, can be implemented by computer-readable program instructions.

These computer-readable program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, or another programmable data processing apparatus to produce a machine, such that the instructions, when executed via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in one or more blocks of the flowchart or block diagram, or both. These computer-readable program instructions may also be stored in a computer-readable storage medium that can direct a computer, a programmable data processing apparatus, other devices, or a combination thereof to function in a particular manner, such that the computer-readable storage medium having the instructions stored therein constitutes an article of manufacture including instructions that implement aspects of the functions/acts specified in one or more blocks of the flowchart or block diagram, or both.

The computer-readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other device to cause a series of operational steps to be performed on the computer, other programmable apparatus, or other device, producing a computer-implemented process, such that the instructions that execute on the computer, other programmable apparatus, or other device implement the functions/acts specified in one or more blocks of the flowchart or block diagram, or both.

The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present invention. In this regard, each block in a flowchart or block diagram may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function. In some alternative embodiments, the functions noted in the blocks may occur out of the order shown in the figures. For example, two blocks shown in succession may in fact be accomplished as a single step, executed concurrently, substantially concurrently, or in a partially or wholly temporally overlapping manner, or the blocks may sometimes be executed in the reverse order, depending on the functionality involved. It should also be noted that each block of the block diagrams or flowchart illustrations, or both, and combinations of blocks in the block diagrams or flowchart illustrations, or both, can be implemented by special-purpose hardware-based systems that perform the specified functions or acts, or that carry out combinations of special-purpose hardware and computer instructions.

The computer system 90 shown in Figure 8 includes a processor 91, an input device 92 coupled to the processor 91, an output device 93 coupled to the processor 91, and storage devices 94 and 95, each coupled to the processor 91. The input device 92 may be, among others, a keyboard, a mouse, a camera, a touchscreen, etc. The output device 93 may be, among others, a printer, a plotter, a computer screen, a magnetic tape, a removable hard disk, a floppy disk, etc. The storage devices 94 and 95 may be, among others, a hard disk, a floppy disk, a magnetic tape, an optical storage device such as a compact disc (CD) or a digital video disc (DVD), dynamic random access memory (DRAM), read-only memory (ROM), etc. The storage device 95 includes computer code 97. The computer code 97 includes algorithms (e.g., the algorithm of Figure 2) for improving software technology associated with generating and modifying image sequences related to the text content of a digital story and dynamically changing the associated digital text content. The processor 91 executes the computer code 97. The storage device 94 includes input data 96. The input data 96 includes the input required by the computer code 97. The output device 93 displays output from the computer code 97. Either or both of the storage devices 94 and 95 (or one or more additional storage devices, such as a read-only storage device 85) may include the algorithms (e.g., the algorithm of Figure 2) and may be used as a computer-usable medium (or a computer-readable medium or a program storage device) having computer-readable program code embodied therein, having other data stored therein, or both, where the computer-readable program code includes the computer code 97. Generally, a computer program product (or, alternatively, an article of manufacture) of the computer system 90 may include the computer-usable medium (or the program storage device).

In some embodiments, rather than being stored in and accessed from a hard drive, optical disc, or other writable, rewritable, or removable hardware storage device 95, stored computer program code 84 (e.g., including algorithms) may be stored on a static, non-removable, read-only storage medium such as a read-only memory (ROM) device 85, or may be accessed by the processor 91 directly from such a static, non-removable, read-only medium. Similarly, in some embodiments, the stored computer program code 97 may be stored as computer-readable firmware 85, or may be accessed by the processor 91 directly from such firmware 85, rather than from a more dynamic or removable hardware data storage device 95 such as a hard drive or optical disc.

Nevertheless, any of the components of the present invention could be created, integrated, hosted, maintained, deployed, managed, serviced, etc. by a service supplier who offers to improve software technology associated with generating and modifying image sequences related to the text content of a digital story and dynamically changing the associated digital text content. Thus, the present invention discloses a process for deploying, creating, integrating, hosting, or maintaining a computing infrastructure, or a combination thereof, including integrating computer-readable code into the computer system 90, where the code in combination with the computer system 90 is capable of performing a method for enabling that process. In another embodiment, the present invention provides a business method that performs the process steps of the present invention on a subscription, advertising, or fee basis, or a combination thereof. That is, a service supplier, such as a solution integrator, could offer to enable a process for improving software technology associated with generating and modifying image sequences related to the text content of a digital story and dynamically changing the associated digital text content. In this case, the service supplier can create, maintain, support, etc. a computer infrastructure that performs the process steps of the present invention for one or more customers. In return, the service supplier can receive payment from the customers under a subscription or fee agreement, or both, or can receive payment from the sale of advertising content to one or more third parties, or both.

Although Figure 8 shows the computer system 90 as a particular configuration of hardware and software, any configuration of hardware and software known to a person of ordinary skill in the art may be used for the purposes stated above in conjunction with the particular computer system 90 of Figure 8. For example, the storage devices 94 and 95 may be portions of a single storage device rather than separate storage devices.

<Cloud Computing Environment>
Although this disclosure includes a detailed description of cloud computing, implementation of the teachings described herein is not limited to a cloud computing environment. Rather, embodiments of the present invention can be implemented in conjunction with any other type of computing environment now known or later developed.

Cloud computing is a model of service delivery for enabling convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, network bandwidth, servers, processing, memory, storage, applications, virtual machines, and services) that can be rapidly provisioned and released with minimal management effort or interaction with a provider of the service. This cloud model may include at least five characteristics, at least three service models, and at least four deployment models.

The characteristics are as follows:

On-demand self-service: A cloud consumer can unilaterally provision computing capabilities, such as server time and network storage, automatically as needed, without requiring human interaction with the service provider.

Broad network access: Computing capabilities are available over a network and are accessed through standard mechanisms, which promotes use by heterogeneous thin or thick client platforms (e.g., mobile phones, laptops, PDAs).

Resource pooling: The provider's computing resources are pooled and served to multiple consumers using a multi-tenant model. Different physical and virtual resources are dynamically assigned and reassigned according to demand. There is a sense of location independence, in that the consumer generally has no control over or knowledge of the exact location of the provided resources. The consumer may, however, be able to specify location at a higher level of abstraction (e.g., country, state, data center).

Rapid elasticity: Computing capabilities can be rapidly and elastically provisioned, in some cases automatically, to scale out quickly, and rapidly released to scale in quickly. To the consumer, the capabilities available for provisioning often appear to be unlimited and can be purchased in any quantity at any time.

Measured service: Cloud systems automatically control and optimize resource use by leveraging a metering capability at a level of abstraction appropriate to the type of service (e.g., storage, processing, bandwidth, active user accounts). Resource usage can be monitored, controlled, and reported, providing transparency for both the provider and the consumer of the utilized service.

The service models are as follows:

Software as a Service (SaaS): The capability provided to the consumer is to use the provider's applications running on a cloud infrastructure. The applications are accessible from various client devices through a thin client interface such as a web browser (e.g., web-based email). The consumer does not manage or control the underlying cloud infrastructure, including the network, servers, operating systems, storage, or even individual application capabilities, with the possible exception of limited user-specific application configuration settings.

Platform as a Service (PaaS): The capability provided to the consumer is to deploy onto the cloud infrastructure consumer-created or acquired applications that are built using programming languages and tools supported by the provider. The consumer does not manage or control the underlying cloud infrastructure, including networks, servers, operating systems, and storage, but has control over the deployed applications and, in some cases, over the configuration of the hosting environment.

Infrastructure as a Service (IaaS): The capability provided to the consumer is to provision processing, storage, networks, and other fundamental computing resources on which the consumer can deploy and run arbitrary software, which can include operating systems and applications. The consumer does not manage or control the underlying cloud infrastructure, but has control over operating systems, storage, and deployed applications, and in some cases has limited control over select networking components (e.g., host firewalls).

The deployment models are as follows:

Private cloud: The cloud infrastructure is operated solely for a specific organization. It can be managed by that organization or by a third party, and can exist on-premises or off-premises.

Community cloud: The cloud infrastructure is shared by several organizations and supports a specific community that has shared concerns (e.g., mission, security requirements, policy, and compliance considerations). It can be managed by those organizations or by a third party, and can exist on-premises or off-premises.

Public cloud: The cloud infrastructure is made available to the general public or to a large industry group and is owned by an organization that sells cloud services.

Hybrid cloud: The cloud infrastructure is a composition of two or more clouds (private, community, or public) that remain unique entities but are bound together by standardized or proprietary technology that enables data and application portability (e.g., cloud bursting for load balancing between clouds).
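Cloud bursting, mentioned above as an example of portability in a hybrid cloud, can be illustrated with the following minimal sketch, in which requests are placed on a private cloud up to its capacity and any overflow is placed on a public cloud. The function name and the notion of a fixed request capacity are simplifying assumptions made for this example.

```python
def place_requests(request_count: int, private_capacity: int) -> dict:
    """Split a batch of requests between a private and a public cloud.

    Requests fill the private cloud first; only the overflow "bursts"
    to the public cloud. Both parameters are hypothetical inputs.
    """
    if request_count < 0 or private_capacity < 0:
        raise ValueError("counts must be non-negative")
    on_private = min(request_count, private_capacity)
    return {"private": on_private, "public": request_count - on_private}
```

Under these assumptions, 130 requests against a private capacity of 100 would place 100 requests on the private cloud and 30 on the public cloud.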

A cloud computing environment is service oriented, with a focus on statelessness, low coupling, modularity, and semantic interoperability. At the heart of cloud computing is an infrastructure that includes a network of interconnected nodes.

Referring to Figure 9, an exemplary cloud computing environment 50 is shown. As illustrated, the cloud computing environment 50 includes one or more cloud computing nodes 10 with which local computing devices used by cloud consumers, such as a personal digital assistant (PDA) or mobile phone 54A, a desktop computer 54B, a laptop computer 54C, an automotive computer system 54N, or a combination thereof, can communicate. The nodes 10 can communicate with one another. The nodes 10 can be grouped physically or virtually (not shown) in one or more networks, such as the private, community, public, or hybrid clouds described above, or a combination thereof. This allows the cloud computing environment 50 to offer infrastructure, platforms, software, or a combination thereof as services for which a cloud consumer does not need to maintain resources on a local computing device. The types of computing devices 54A, 54B, 54C, and 54N shown in Figure 9 are illustrative only, and the computing nodes 10 and the cloud computing environment 50 can communicate with any type of electronic device over any type of network, any type of network-addressable connection (e.g., using a web browser), or both.

Referring to Figure 10, a set of functional abstraction layers provided by the cloud computing environment 50 (Figure 9) is shown. It should be understood in advance that the components, layers, and functions shown in Figure 10 are illustrative only and that embodiments of the present invention are not limited to them. As illustrated, the following layers and corresponding functions are provided.

The hardware and software layer 60 includes hardware components and software components. Examples of hardware components include mainframes 61, reduced instruction set computer (RISC) architecture-based servers 62, servers 63, blade servers 64, storage devices 65, and networks and networking components 66. In some embodiments, the software components include network application server software 67 and database software 68.

The virtualization layer 70 provides an abstraction layer from which, for example, the following virtual entities can be provided: virtual servers 71, virtual storage 72, virtual networks 73 including virtual private networks, virtual applications and operating systems 74, and virtual clients 75.

As an example, the management layer 80 can provide the following functions. Resource provisioning 81 provides dynamic procurement of computing resources and other resources that are used to perform tasks within the cloud computing environment. Metering and pricing 82 provides cost tracking as resources are used within the cloud computing environment, and billing or invoicing for consumption of these resources. As an example, these resources may include application software licenses. Security provides identity verification for cloud consumers and tasks, as well as protection for data and other resources. The user portal 83 provides consumers and system administrators with access to the cloud computing environment. Service level management 87 provides allocation and management of cloud computing resources so that required service levels are met. Service Level Agreement (SLA) planning and fulfillment 88 provides pre-arrangement for, and procurement of, cloud computing resources for which a future requirement is anticipated in accordance with an SLA.
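The cost tracking and invoicing role of metering and pricing 82 can be sketched, as one possible illustration, by multiplying metered quantities by a rate table. The rate values, resource names, and the function `invoice` are hypothetical and are chosen only to make the arithmetic concrete; `Decimal` is used so that monetary amounts are not subject to binary floating-point rounding.

```python
from decimal import Decimal

# Hypothetical price per unit for each metered resource.
RATES = {
    "cpu_hours": Decimal("0.05"),
    "storage_gb": Decimal("0.02"),
}


def invoice(usage: dict, rates: dict = RATES) -> dict:
    """Return per-resource charges and their total for one consumer.

    usage maps a resource name to the metered quantity consumed.
    """
    lines = {}
    for resource, quantity in usage.items():
        if resource not in rates:
            raise KeyError(f"no rate defined for {resource!r}")
        # Convert through str so float inputs keep their printed value.
        lines[resource] = Decimal(str(quantity)) * rates[resource]
    return {"lines": lines, "total": sum(lines.values(), Decimal("0"))}
```

With these assumed rates, 100 CPU hours and 50 GB of storage would be charged 5.00 and 1.00 respectively, for a total of 6.00.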

The workloads layer 101 provides examples of functionality for which the cloud computing environment can be used. Examples of workloads and functions that can be provided from this layer include: mapping and navigation 102; software development and lifecycle management 103; virtual classroom education delivery 133; data analytics processing 134; transaction processing 106; and improving software technology associated with generating and modifying image sequences associated with the text content of a digital story and dynamically changing the associated digital text content 107.

While embodiments of the present invention have been described herein for purposes of illustration, many modifications and changes will become apparent to those skilled in the art. Accordingly, the appended claims are intended to encompass all such modifications and changes as fall within the scope of the present invention.

Claims

A generative adversarial network (GAN) hardware device comprising a processor coupled to a computer-readable memory unit, the memory unit comprising instructions that, when executed by the processor, implement a digital script modification method that enables natural language processing (NLP), the method comprising:
generating, by the processor, an image sequence associated with text content of a digital story;
identifying, by the processor executing NLP code, a plurality of contextual dimensions within the text content;
selecting, by the processor in response to user input, a group of dimensions of the plurality of contextual dimensions;
expanding or contracting, by the processor, the image sequence in combination with the group of dimensions;
altering, by the processor, the image sequence based on detected interaction with the group of dimensions;
extracting, by the processor during presentation of the digital story and the image sequence, a dimension from the group of dimensions;
enabling, by the processor, a script writer associated with the text content of the digital story to modify the dimension;
modifying, by the processor, the image sequence based on the modification of the dimension occurring in response to the enabling;
enabling, by the processor, a hardware interface device for interacting with various image sequences of the image sequence and for changing the plurality of contextual dimensions; and
dynamically changing, by the processor in response to the enabling, the text content of the digital story.

The GAN hardware device according to claim 1, wherein the plurality of contextual dimensions includes dimensions selected from the group consisting of weather dimensions, event dimensions, location dimensions, time dimensions, physical X, Y, Z location dimensions, and velocity dimensions.

The GAN hardware device according to claim 1, wherein the method further comprises:
enabling, by the processor via the hardware interface device, the script writer to add an additional contextual dimension to the image sequence;
performing, by the processor, a first modification of the image sequence with respect to the additional contextual dimension; and
performing, by the processor executing an inverse GAN model with respect to a result of the first modification, a second modification of the text content.

The GAN hardware device according to claim 1, wherein the method further comprises:
enabling, by the processor via the hardware interface device, the script writer to selectively modify at least one visual object in the image sequence; and
modifying, by the processor executing an inverse GAN model with respect to a result of enabling the script writer, the text content.

The GAN hardware device according to claim 1, wherein the method further comprises:
enabling, by the processor via the hardware interface device, the script writer to selectively remove at least one visual object from the image sequence; and
modifying, by the processor executing an inverse GAN model with respect to a result of enabling the script writer, the text content.

The GAN hardware device according to claim 1, wherein the method further comprises:
enabling, by the processor via the hardware interface device, the script writer to selectively add at least one visual object to the image sequence; and
modifying, by the processor executing an inverse GAN model with respect to a result of enabling the script writer, the text content.

The GAN hardware device according to claim 1, wherein the method further comprises:
enabling, by the processor via the hardware interface device during interaction with the various image sequences, the script writer to divide the image sequence into multiple image sequences;
dividing, by the processor in response to a result of enabling the script writer, the text content; and
generating, by the processor in response to the dividing, new text content for the digital story.

The GAN hardware device according to claim 1, wherein the method further comprises:
enabling, by the processor via the hardware interface device during interaction with the various image sequences, the script writer to concatenate multiple image sequences of the image sequence;
merging, by the processor in response to a result of the enabling, the text content; and
generating, by the processor in response to the merging, new text content for the digital story.

The GAN hardware device according to claim 1, wherein the hardware interface device comprises a virtual reality (VR) interface device.

A digital script modification method that enables natural language processing (NLP), the method comprising:
generating, by a processor of a generative adversarial network (GAN) hardware device, an image sequence associated with text content of a digital story;
identifying, by the processor executing NLP code, a plurality of contextual dimensions within the text content;
selecting, by the processor in response to user input, a group of dimensions of the plurality of contextual dimensions;
expanding or contracting, by the processor, the image sequence in combination with the group of dimensions;
altering, by the processor, the image sequence based on detected interaction with the group of dimensions;
extracting, by the processor during presentation of the digital story and the image sequence, a dimension from the group of dimensions;
enabling, by the processor, a script writer associated with the text content of the digital story to modify the dimension;
modifying, by the processor, the image sequence based on the modification of the dimension occurring in response to the enabling;
enabling, by the processor, a hardware interface device for interacting with various image sequences of the image sequence and for changing the plurality of contextual dimensions; and
dynamically changing, by the processor in response to the enabling, the text content of the digital story.

The method according to claim 10, wherein the plurality of contextual dimensions includes dimensions selected from the group consisting of weather dimensions, event dimensions, location dimensions, time dimensions, physical X, Y, Z location dimensions, and velocity dimensions.

The method according to claim 10, further comprising:
enabling, by the processor via the hardware interface device, the script writer to add an additional contextual dimension to the image sequence;
performing, by the processor, a first modification of the image sequence with respect to the additional contextual dimension; and
performing, by the processor executing an inverse GAN model with respect to a result of the first modification, a second modification of the text content.

The method according to claim 10, further comprising:
enabling, by the processor via the hardware interface device, the script writer to selectively modify at least one visual object in the image sequence; and
modifying, by the processor executing an inverse GAN model with respect to a result of enabling the script writer, the text content.

The method according to claim 10, further comprising:
enabling, by the processor via the hardware interface device, the script writer to selectively remove at least one visual object from the image sequence; and
modifying, by the processor executing an inverse GAN model with respect to a result of enabling the script writer, the text content.

The method according to claim 10, further comprising:
enabling, by the processor via the hardware interface device, the script writer to selectively add at least one visual object to the image sequence; and
modifying, by the processor executing an inverse GAN model with respect to a result of enabling the script writer, the text content.

The method according to claim 10, further comprising:
enabling, by the processor via the hardware interface device during interaction with the various image sequences, the script writer to divide the image sequence into multiple image sequences;
dividing, by the processor in response to a result of enabling the script writer, the text content; and
generating, by the processor in response to the dividing, new text content for the digital story.

The method according to claim 10, further comprising:
enabling, by the processor via the hardware interface device during interaction with the various image sequences, the script writer to concatenate multiple image sequences of the image sequence;
merging, by the processor in response to a result of the enabling, the text content; and
generating, by the processor in response to the merging, new text content for the digital story.

The method according to claim 10, wherein the hardware interface device comprises a virtual reality (VR) interface device.

The method according to claim 10, further comprising providing a computer system that provides at least one support service for at least one of creating, integrating, hosting, maintaining, and deploying computer-readable code, wherein the computer-readable code, when executed by the processor, causes the processor to perform the generating, the identifying, the selecting, the expanding or contracting, the altering, the extracting, the enabling of the script writer, the modifying, the enabling of the hardware interface device, and the dynamically changing.

A computer program comprising computer-readable program code, the computer-readable program code including an algorithm that, when executed by a processor of a server, implements a digital script modification method that enables natural language processing (NLP), the method comprising:
generating, by the processor, an image sequence associated with text content of a digital story;
identifying, by the processor executing NLP code, a plurality of contextual dimensions within the text content;
selecting, by the processor in response to user input, a group of dimensions of the plurality of contextual dimensions;
expanding or contracting, by the processor, the image sequence in combination with the group of dimensions;
altering, by the processor, the image sequence based on detected interaction with the group of dimensions;
extracting, by the processor during presentation of the digital story and the image sequence, a dimension from the group of dimensions;
enabling, by the processor, a script writer associated with the text content of the digital story to modify the dimension;
modifying, by the processor, the image sequence based on the modification of the dimension occurring in response to the enabling;
enabling, by the processor, a hardware interface device for interacting with various image sequences of the image sequence and for changing the plurality of contextual dimensions; and
dynamically changing, by the processor in response to the enabling, the text content of the digital story.