CN116741330A - Diagnosis and treatment report generation method, device, equipment and storage medium - Google Patents


Info

Publication number
CN116741330A
Authority
CN
China
Prior art keywords
user, virtual object, target virtual, data, target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310532844.XA
Other languages
Chinese (zh)
Inventor
张膂
张海洋
冉勇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd filed Critical Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN202310532844.XA priority Critical patent/CN116741330A/en
Publication of CN116741330A publication Critical patent/CN116741330A/en
Pending legal-status Critical Current

Landscapes

  • Medical Treatment And Welfare Office Work (AREA)

Abstract

The disclosure provides a diagnosis and treatment report generation method, device, equipment and storage medium, relating to the technical field of artificial intelligence, in particular to the fields of virtual digital humans, large AI models, and the like, and applicable to psychotherapy assistance scenarios. The specific implementation scheme comprises the following steps: constructing a target virtual object according to feature data of a target object, wherein the feature data comprises at least one of appearance feature data and voiceprint feature data; displaying the target virtual object; driving the target virtual object according to driving information so that a first user communicates with a second user through the target virtual object, wherein the driving information is generated according to a communication instruction of the first user; and generating a diagnosis and treatment report according to communication data between the second user and the target virtual object. The present disclosure can improve treatment efficiency.

Description

Diagnosis and treatment report generation method, device, equipment and storage medium
Technical Field
The disclosure relates to the technical field of artificial intelligence, in particular to the fields of virtual digital humans, large AI models, and the like, can be applied to psychotherapy assistance scenarios, and specifically concerns a diagnosis and treatment report generation method, device, equipment and storage medium.
Background
When a psychologist treats a psychological disorder, the patient's psychological problems must be drawn out through conversation; the condition is then assessed from the patient's account, and a corresponding treatment plan is formulated.
When describing their problems, some patients unconsciously resist because of the psychologist's professional identity and cannot open up to the psychologist. The psychologist must guide them many times before they describe their real psychological problems, so treatment efficiency is low.
Disclosure of Invention
The disclosure provides a diagnosis and treatment report generation method, device, equipment and storage medium, which can improve treatment efficiency.
According to a first aspect of the present disclosure, there is provided a diagnosis and treatment report generating method, including:
constructing a target virtual object according to feature data of the target object, wherein the feature data comprises at least one of appearance feature data and voiceprint feature data; displaying the target virtual object; driving the target virtual object according to driving information so that the first user communicates with the second user through the target virtual object, wherein the driving information is generated according to a communication instruction of the first user; and generating a diagnosis and treatment report according to the communication data between the second user and the target virtual object.
According to a second aspect of the present disclosure, there is provided a diagnosis and treatment report generating apparatus, the apparatus comprising: a display module, a driving module and a generating module.
The display module is used for constructing a target virtual object according to the characteristic data of the target object, wherein the characteristic data comprises at least one of appearance characteristic data and voiceprint characteristic data; and displaying the target virtual object.
And the driving module is used for driving the target virtual object according to the driving information so that the first user communicates with the second user through the target virtual object, and the driving information is generated according to the communication instruction of the first user.
And the generation module is used for generating a diagnosis and treatment report according to the communication data between the second user and the target virtual object.
According to a third aspect of the present disclosure, there is provided an electronic device comprising: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform a method as in the first aspect.
According to a fourth aspect of the present disclosure, there is provided a non-transitory computer readable storage medium storing computer instructions for causing a computer to perform the method according to the first aspect.
According to a fifth aspect of the present disclosure, there is provided a computer program product comprising a computer program which, when executed by a processor, implements the method according to the first aspect.
It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the disclosure, nor is it intended to be used to limit the scope of the disclosure. Other features of the present disclosure will become apparent from the following specification.
Drawings
The drawings are for a better understanding of the present solution and are not to be construed as limiting the present disclosure. Wherein:
fig. 1 is a schematic flow chart of a diagnosis and treatment report generating method according to an embodiment of the disclosure;
FIG. 2 is a schematic flow chart of S101 in FIG. 1 according to an embodiment of the disclosure;
FIG. 3 is a schematic flow chart of S103 in FIG. 1 according to an embodiment of the present disclosure;
fig. 4 is a schematic diagram of the components of a diagnostic report generating apparatus according to an embodiment of the present disclosure;
fig. 5 is a schematic diagram of the composition of an electronic device according to an embodiment of the disclosure.
Detailed Description
Exemplary embodiments of the present disclosure are described below in conjunction with the accompanying drawings, which include various details of the embodiments of the present disclosure to facilitate understanding, and should be considered as merely exemplary. Accordingly, one of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
It should be appreciated that in embodiments of the present disclosure, the character "/" generally indicates an "or" relationship between the associated objects before and after it. The terms "first," "second," and the like are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated.
When a psychologist treats a psychological disorder, the patient's psychological problems must be drawn out through conversation; the condition is then assessed from the patient's account, and a corresponding treatment plan is formulated.
When describing their problems, some patients unconsciously resist because of the psychologist's professional identity and cannot open up to the psychologist. The psychologist must guide them many times before they describe their real psychological problems, so treatment efficiency is low.
For example, some patients have experienced emotional trauma or other incidents that leave them with a low sense of psychological safety: they deeply distrust the psychologist and are unable or unwilling to say much to the psychologist. The psychologist therefore cannot determine the patient's psychological symptoms from the patient's account, and treatment efficiency is low.
Under the background technology, the present disclosure provides a diagnosis and treatment report generating method, device, equipment and storage medium, which can improve treatment efficiency.
The execution subject of the diagnosis and treatment report generation method provided by the embodiments of the disclosure may be a computer or a server, or another electronic device with data processing capability; alternatively, it may be a processor (e.g., a central processing unit (CPU)) in such an electronic device; alternatively, it may be an application (APP) installed in the electronic device that implements the function of the method; alternatively, it may be a functional module, unit, or the like in the electronic device having the function of the method. The execution subject of the method is not limited herein.
The diagnosis report generation method is exemplarily described below with reference to the accompanying drawings.
Fig. 1 is a flow chart of a diagnosis and treatment report generating method according to an embodiment of the present disclosure. As shown in fig. 1, the method may include:
s101, constructing a target virtual object according to feature data of the target object, wherein the feature data comprises at least one of appearance feature data and voiceprint feature data.
In a psychotherapy scenario, the target object may be an object the patient is willing to confide in, or an object determined from the patient's basic information, past medical records, and the like.
Illustratively, the target virtual object may include an appearance model and a voice model. The appearance model may be a 3D or 2D cartoon, anthropomorphic, or realistic model constructed from the appearance feature data; the voice model may be trained from the voiceprint feature data. The voice model takes voice data or text data as input and outputs audio matching the voiceprint features.
Taking the target object being a particular person as an example, the appearance feature data may include facial feature data, body feature data, clothing feature data, and the like, and the voiceprint feature data may be determined from speech data in videos, audio recordings, and the like that contain the target object.
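As a concrete illustration of S101, the construction step can be sketched as below. This is a minimal sketch, not the disclosed implementation: all class, field, and function names are hypothetical, and the dictionaries merely stand in for real avatar and voice-cloning models.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FeatureData:
    """Feature data of the target object (S101); fields are illustrative."""
    appearance: Optional[dict] = None    # e.g. facial, body, clothing features
    voiceprint: Optional[bytes] = None   # e.g. an embedding from reference audio


@dataclass
class TargetVirtualObject:
    appearance_model: Optional[dict] = None  # stands in for a 3D/2D avatar model
    voice_model: Optional[dict] = None       # stands in for a trained TTS model


def build_target_virtual_object(fd: FeatureData) -> TargetVirtualObject:
    """Build the target virtual object from whichever feature data is present."""
    obj = TargetVirtualObject()
    if fd.appearance is not None:
        obj.appearance_model = {"type": "avatar", "features": fd.appearance}
    if fd.voiceprint is not None:
        obj.voice_model = {"type": "cloned_tts", "voiceprint": fd.voiceprint}
    return obj
```

Because the feature data "comprises at least one of" the two kinds, either model may be absent, which the optional fields reflect.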
S102, displaying the target virtual object.
For example, the image of the target virtual object may be displayed by a display device provided on the electronic apparatus, and the sound of the target virtual object may be presented by an audio output device provided on the electronic apparatus. The display device and the audio output device may be integrally provided, or may be independently provided, which is not limited herein. Illustratively, the display device may be a display, a projection (e.g., three-dimensional projection, planar projection, etc.), an AR or VR device, etc., and the audio output device may be a headset, speaker, etc.
And S103, driving the target virtual object according to driving information so that the first user communicates with the second user through the target virtual object, wherein the driving information is generated according to the communication instruction of the first user.
The communication instruction is the instruction the first user issues when expecting to obtain desired information from the second user. For example, the communication instruction may be the expression, action, or speech the first user produces while communicating with the second user in pursuit of the desired information; that is, by inputting communication instructions, the first user can control the target virtual object to communicate with the second user in the first user's stead.
The driving information is information which can be read and executed by the electronic equipment based on the communication instruction, so that the electronic equipment can directly drive the target virtual object to display corresponding expression, action and voice according to the driving information.
For example, when the communication instruction is an expression, an action, or a voice of the first user, the driving information may be directly generated according to the expression, the action, or the voice of the first user, so as to drive the target virtual object to display the expression, the action, or the voice corresponding to the communication instruction.
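The direct case above, where driving information is produced straight from the first user's expression, action, or voice, can be sketched as a simple channel mapping (the function and key names are assumptions, not terms from the disclosure):

```python
def generate_driving_info(instruction: dict) -> dict:
    """Copy the first user's expression/action/voice channels directly
    into driving information for the target virtual object (S103)."""
    channels = ("expression", "action", "voice")
    return {k: v for k, v in instruction.items() if k in channels}
```

Anything outside the three recognized channels is ignored rather than forwarded to the virtual object.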
When the electronic device displays, via the target virtual object, the expression, action, and speech corresponding to the communication instruction, the second user can communicate with the first user through the target virtual object displayed by the electronic device.
The electronic device may capture, through a camera and a microphone, the expressions, actions, and speech produced by the second user during the communication, thereby obtaining the communication data generated by the second user, and may output that data through a display device and an audio output device provided on the electronic device. The first user can view the communication data through these devices and thus continue communicating with the second user accordingly. The display device and audio output device used to output the communication data may be the same as or different from those used to display the target virtual object.
For example, in a psychology therapy scenario, a first user may be a psychological doctor and a second user may be a patient in need of therapy.
S104, generating a diagnosis and treatment report according to the communication data between the second user and the target virtual object.
For example, the communication data may be video data including an expression and/or an action of the second user and/or audio data including a voice of the second user, which are acquired by the electronic device during communication between the second user and the target virtual object.
For example, a knowledge graph and treatment data about the psychological disease may be previously established, and a diagnosis report may be generated according to the knowledge graph and treatment data of the psychological disease, user information of the second user, and communication data between the second user and the target virtual object.
According to the embodiment of the disclosure, a target virtual object corresponding to the target object is constructed and driven according to driving information generated from the first user's communication instructions, so that the first user communicates with the second user through the target virtual object. This shortens the distance between the first user and the second user, lets the second user speak more freely about psychological problems, and improves treatment efficiency. A diagnosis and treatment report about the second user can then be obtained from the communication data between the second user and the target virtual object, providing a reference for the diagnosis and treatment process and further improving treatment efficiency.
In one possible embodiment, the target virtual object includes an appearance model and a speech model. Fig. 2 is a schematic flow chart of S101 in fig. 1 according to an embodiment of the disclosure. As shown in fig. 2, S101 may include:
s201, acquiring characteristic data of a target object.
The feature data of the target object may be acquired by acquiring features of the target object, or may be acquired according to data such as an image and voice of the target object.
S202, constructing and obtaining an appearance model of the target virtual object according to appearance characteristic data in the characteristic data.
The appearance model may be a 3D or 2D avatar model obtained from the appearance feature data, and can be driven by the driving information to make corresponding expressions and actions.
For example, when the feature data of the target object includes appearance feature data (for example, the target object is any object with a physical appearance), an image of the target object may be obtained from the appearance feature data and a 3D or 2D model built from it; this model is the appearance model of the target virtual object.
For example, when the feature data of the target object does not include appearance feature data (for example, the target object has no physical appearance, or no appearance feature data was acquired), the appearance model of the target virtual object need not be constructed.
And/or S203, constructing a voice model of the target virtual object according to the voiceprint feature data in the feature data.
Wherein, the voice model can be used for outputting the audio conforming to the voiceprint characteristics of the target object according to the input voice or words.
For example, when the feature data of the target object includes voiceprint feature data (for example, the target object is a person or a cartoon object with a dubbing actor for performing voice dubbing), the voiceprint feature data may be used to train a preset initial voice model, and the trained voice model is the voice model of the target virtual object.
For example, when the feature data of the target object does not include voiceprint feature data (for example, the target object is an animal or another thing that cannot communicate through speech), a voice model trained on the target object's voiceprint features need not be constructed; instead, a voice model that converts input text into corresponding speech may be constructed, so that even when the target object itself has no speech capability, the corresponding target virtual object can still communicate efficiently with the second user.
When the appearance model and/or the speech model are obtained based on S202 and S203, the appearance model and/or the speech model may be used as the target virtual object.
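The branch in S203, training on voiceprint data when it exists and otherwise falling back to generic text-to-speech, can be sketched as follows; the dictionary return values are purely illustrative stand-ins for trained models.

```python
from typing import Optional


def build_voice_model(voiceprint: Optional[bytes] = None) -> dict:
    """S203 sketch: use a voice model conditioned on the voiceprint data
    when it is available; otherwise fall back to a generic text-to-speech
    model so the virtual object can still speak."""
    if voiceprint is not None:
        return {"kind": "cloned_tts", "voiceprint": voiceprint}
    return {"kind": "generic_tts"}
```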
According to the embodiment of the disclosure, the appearance model and/or the voice model of the target object are built, so that the target virtual object containing the appearance characteristics and/or the voiceprint characteristics of the target object can be accurately built, and the target virtual object is more similar to the target object. Therefore, the first user can communicate with the second user more easily through the target virtual object, and the treatment efficiency is further improved.
In one possible embodiment, the driving information may be generated according to at least one of expression, motion and voice input by the first user.
That is, the at least one item of expression, action, or voice information input by the first user constitutes the communication instruction.
Illustratively, the expression and action input by the first user may be obtained by capturing video with a camera or other image-acquisition device and performing expression-feature and action-feature extraction on the video; the voice input by the first user may be obtained by capturing the first user's audio with a microphone or other audio-acquisition device and extracting the speech from the audio.
By way of example, taking the information input by the first user as actions and voice: action driving information and voice driving information are generated from the first user's actions and voice respectively; the target virtual object is driven by the action driving information to perform actions consistent with those input by the first user, and is simultaneously driven by the voice driving information to output audio consistent with the voice content input by the first user.
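The action-plus-voice case can be sketched as a small pipeline. This is a hypothetical illustration: `extract_keypoints` is a placeholder for any real pose-feature extractor, and the raw audio is passed through unchanged as the voice driving signal.

```python
def extract_keypoints(frames):
    """Placeholder pose-feature extraction: one keypoint record per frame."""
    return [{"frame": i} for i, _ in enumerate(frames)]


def drive_from_user_input(action_frames, speech_audio):
    """Generate action driving information and voice driving information
    from the first user's captured motion video and speech audio."""
    action_driving = {"keypoints": extract_keypoints(action_frames)}
    voice_driving = {"audio": speech_audio}
    return action_driving, voice_driving
```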
According to the embodiment of the disclosure, driving information corresponding to the expression, the action and the voice input by the first user can be generated according to at least one piece of information of the expression, the action and the voice input by the first user. Therefore, the expression, action and voice content of the target virtual object are consistent with those of the first user, the first user can directly drive the target virtual object to communicate with the second user, the content displayed by the target virtual object is more consistent with the expectations of the first user, and the communication efficiency is improved.
In a possible embodiment, fig. 3 is a schematic flow chart of S103 in fig. 1 provided in the embodiment of the disclosure. As shown in fig. 3, S103 may include:
s301, acquiring an exchange instruction of a first user.
The communication instruction is the instruction the first user issues when expecting to obtain desired information from the second user. Specifically, the communication instruction may be derived from the direction the first user intends the conversation to take. For example, when the first user wants to know the second user's mood state, the communication instruction may be "acquire mood state"; when the first user wants to learn about the second user's emotional experience, the communication instruction may be "get emotional experience".
S302, according to an exchange instruction of a first user, generating AI driving information through an AI large model, wherein the AI large model is a deep learning model trained by using image data of a target object in advance, and the image data comprises at least one of audio data, video data, picture data and text data.
The AI large model is a pre-trained Transformer-class deep learning model with billions of parameters. It is trained on image data of the target object so that, given an input instruction, it can produce communication content consistent with the target object's persona and generate corresponding AI driving information from that content to drive the target virtual object.
For example, an initial large model in the related art may be fine-tuned (fine-tune) on the image data of the target object to obtain the AI large model.
The initial large model may be a deep learning model capable of generating corresponding communication content from an input instruction. It is further trained on the image data of the target object to obtain the AI large model, which can then produce communication content consistent with the target object's persona from an input instruction and generate corresponding AI driving information from that content to drive the target virtual object.
When the electronic equipment receives the communication instruction of the first user, the electronic equipment can process the communication instruction through the AI large model to generate AI driving information.
Illustratively, taking the acquired communication instruction "acquire mood state" as an example, the AI large model generates from the instruction AI driving information corresponding to an expression (e.g., a "puzzled" expression), an action (e.g., a head-tilting action), and speech (e.g., "Why have you been unhappy lately?") that are consistent with the target object.
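S302 can be sketched as below. Since the disclosure names no concrete model API, the large model is injected as a plain callable (`llm`), and the prompt wording and channel names are assumptions.

```python
def ai_driving_info(instruction: str, persona: str, llm) -> dict:
    """S302 sketch: ask a large model fine-tuned on the target object's
    image data for a reply, then wrap the reply channels as AI driving
    information. `llm` is any callable that takes a prompt string and
    returns a dict of reply channels."""
    prompt = (f"You are {persona}. Therapist instruction: {instruction}. "
              "Reply with an expression, an action, and speech.")
    reply = llm(prompt)
    return {"expression": reply.get("expression"),
            "action": reply.get("action"),
            "voice": reply.get("speech")}
```

Channels the model does not return come back as `None`, so downstream driving code can skip them.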
S303, driving the target virtual object according to the AI driving information.
For example, according to the AI driving information, the target virtual object may be driven to exhibit the expression, the motion, and the voice conforming to the expression, the motion, and the voice of the target object and related to the communication instruction.
According to the embodiment of the disclosure, AI driving information is generated from the first user's communication instruction by the AI large model and used to drive the target virtual object, making the target virtual object more similar to the target object, so that the second user trusts the target virtual object more, speaks more freely, and treatment efficiency is further improved. Moreover, because the AI driving information is generated automatically by the AI large model from the communication instruction input by the first user, the first user does not need to input communication instructions frequently to control the target virtual object, and can thus control it to communicate with the second user more conveniently.
In a possible embodiment, before driving the target virtual object according to the AI driving information in the foregoing embodiment, the method may further include:
and updating the AI driving information according to the adjustment information input by the first user.
For example, the first user may preview, through a display device and an audio output device provided on the electronic device, the information (expressions, actions, and speech) that the AI driving information would have the target virtual object display. When the first user finds that this information is wrong or does not match expectations, the first user may input adjustment information to the electronic device; the adjustment information may be, for example, at least one of an expression, an action, and speech. The electronic device then updates the AI driving information to the driving information corresponding to the adjustment information input by the first user.
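The update step can be sketched as replacing AI-generated channels with the first user's adjustments while leaving the rest untouched (names are illustrative, not from the disclosure):

```python
def update_driving_info(ai_driving: dict, adjustment: dict) -> dict:
    """Overwrite AI driving information with the channels the first user
    adjusted; channels without an adjustment keep their AI-generated value."""
    updated = dict(ai_driving)
    for channel in ("expression", "action", "voice"):
        if channel in adjustment:
            updated[channel] = adjustment[channel]
    return updated
```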
According to the embodiment of the disclosure, the AI driving information is updated according to the adjustment information input by the first user, so that the AI driving information can be updated to the driving information expected by the first user, and the treatment effect is ensured.
In a possible embodiment, the generating a diagnosis report according to the communication data between the second user and the target virtual object in the foregoing embodiment may include:
and generating a diagnosis and treatment report according to the pre-constructed psychological disease core knowledge graph, the pre-constructed psychological disease core treatment data, the pre-acquired user information of the second user and the communication data between the second user and the target virtual object.
Illustratively, the user information of the second user may include age, gender, historical treatment data, etc., without limitation herein.
For example, the communication data between the second user and the target virtual object may be input into a diagnosis and treatment report generation model trained in advance according to the psychological disease core knowledge graph, the psychological disease core treatment data, and the user information of the second user, so as to generate a diagnosis and treatment report.
Alternatively, key information may first be extracted from the communication data between the second user and the target virtual object by a key-information extraction model, and then input into a diagnosis and treatment report generation model trained in advance on the psychological disease core knowledge graph, the psychological disease core treatment data, and the user information of the second user, to generate the diagnosis and treatment report. For example, taking the communication data as video data containing the second user's actions, key actions can be extracted by the key-information extraction model and then input into the trained diagnosis and treatment report generation model to generate the diagnosis and treatment report.
The initial model may be trained on pre-constructed communication data samples and their corresponding diagnosis and treatment reports, the latter obtained from the psychological disease core knowledge graph, the psychological disease core treatment data, and the user information of the second user, yielding a diagnosis and treatment report generation model that can generate a corresponding report upon receiving communication data.
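The two report-generation variants above, with and without key-information extraction, can be sketched in one function. The extraction and report models are injected as optional callables, since the disclosure specifies only their inputs and outputs, not any concrete implementation.

```python
def generate_report(comm_data, knowledge_graph, treatment_data, user_info,
                    extract=None, report_model=None):
    """S104 sketch: optionally extract key information from the
    communication data, assemble the model inputs, then run a pretrained
    (here hypothetical) diagnosis and treatment report generation model."""
    key_info = extract(comm_data) if extract is not None else comm_data
    context = {"knowledge_graph": knowledge_graph,
               "treatment_data": treatment_data,
               "user_info": user_info,
               "key_info": key_info}
    # Without a trained model supplied, return the assembled context itself.
    return report_model(context) if report_model is not None else context
```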
According to the embodiment of the disclosure, the diagnosis and treatment report can be accurately generated according to the pre-constructed psychological disease core knowledge graph, the pre-constructed psychological disease core treatment data, the pre-acquired user information of the second user and the communication data between the second user and the target virtual object, so that the accuracy of the diagnosis and treatment report is improved.
In a possible embodiment, the target object may be an object selected by the second user, or an object determined according to user information of the second user.
The object selected by the second user is an object the second user designates as matching their own expectations, for example, someone close to the second user.
Illustratively, the user information of the second user may include a family context, emotional experience, social relationship, etc. of the second user, without limitation herein.
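Target-object selection per this embodiment can be sketched as: honor the second user's explicit choice, otherwise pick the closest relation from their user information. The `social_relations` and `closeness` fields are hypothetical, introduced only for illustration.

```python
def pick_target_object(user_info: dict, selected=None):
    """Return the second user's explicitly chosen object if given;
    otherwise the relation with the highest (hypothetical) closeness
    score in the user information, or None if no relations are known."""
    if selected is not None:
        return selected
    relations = user_info.get("social_relations", [])
    best = max(relations, key=lambda r: r.get("closeness", 0), default=None)
    return best["name"] if best else None
```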
According to the embodiment of the disclosure, the target object which the second user is willing to communicate with can be accurately obtained through the selection of the second user or according to the user information of the second user. Therefore, the target virtual object constructed by the target object by the first user can be communicated with the second user more easily, and the treatment efficiency is further improved.
The foregoing description of the embodiments of the present disclosure has been presented primarily in terms of methods. To achieve the above functions, the solution includes corresponding hardware structures and/or software modules that perform the respective functions. Those of skill in the art will readily appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as hardware or combinations of hardware and computer software. Whether a function is implemented as hardware or as computer-software-driven hardware depends upon the particular application and the design constraints imposed on the solution. Skilled artisans may use different methods to implement the described functions for each particular application, but such implementations should not be considered beyond the scope of the present disclosure.
In an exemplary embodiment, the embodiment of the present disclosure further provides a diagnosis and treatment report generating apparatus, which may be used to implement the diagnosis and treatment report generating method as in the foregoing embodiment.
Fig. 4 is a schematic diagram of the components of the diagnosis and treatment report generating apparatus according to the embodiment of the present disclosure. As shown in fig. 4, the apparatus may include: a presentation module 401, a driving module 402 and a generating module 403.
The display module 401 is configured to construct a target virtual object according to feature data of the target object, where the feature data includes at least one of appearance feature data and voiceprint feature data; and to display the target virtual object.
The driving module 402 is configured to drive the target virtual object according to driving information, so that the first user communicates with the second user through the target virtual object, the driving information being generated according to a communication instruction of the first user.
The generating module 403 is configured to generate a diagnosis and treatment report according to the communication data between the second user and the target virtual object.
In a possible implementation manner, the target virtual object includes an appearance model and a voice model, and the display module 401 is specifically configured to:
acquiring characteristic data of a target object; constructing and obtaining an appearance model of the target virtual object according to appearance feature data in the feature data; and/or constructing a voice model of the target virtual object according to voiceprint feature data in the feature data.
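The construction logic of the display module (an appearance model from appearance feature data and/or a voice model from voiceprint feature data) can be sketched as below. The classes and dictionary placeholders are assumptions for illustration; the patent does not fix a concrete model architecture for either sub-model.

```python
# Sketch of constructing a target virtual object from feature data.
# The dict placeholders stand in for real appearance and voice models,
# each of which would wrap a trained generative model in practice.

class TargetVirtualObject:
    def __init__(self, appearance_model=None, voice_model=None):
        self.appearance_model = appearance_model
        self.voice_model = voice_model

def build_target_virtual_object(feature_data):
    appearance = feature_data.get("appearance")  # e.g. face geometry features
    voiceprint = feature_data.get("voiceprint")  # e.g. a timbre embedding
    # build each sub-model only if the corresponding feature data is present,
    # matching the "and/or" construction in this implementation
    return TargetVirtualObject(
        appearance_model={"built_from": appearance} if appearance else None,
        voice_model={"built_from": voiceprint} if voiceprint else None,
    )

obj = build_target_virtual_object({"appearance": [0.1, 0.2]})
```

With only appearance feature data supplied, only the appearance model is built, which reflects that either sub-model may be constructed independently of the other.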
In one possible implementation, the driving information is generated according to at least one of expression, action and voice input by the first user.
In a possible implementation, the driving module 402 is specifically configured to:
acquiring a communication instruction of the first user; generating AI driving information through an AI large model according to the communication instruction of the first user, wherein the AI large model is a deep learning model trained in advance using image data of the target object, and the image data includes at least one of audio data, video data, picture data and text data; and driving the target virtual object according to the AI driving information.
In a possible implementation, the driving module 402 is further configured to:
before the target virtual object is driven according to the AI driving information, the AI driving information is updated according to the adjustment information of the first user on the AI driving information.
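The driving flow of this module (acquire the communication instruction, generate AI driving information with the large model, let the first user adjust it, then drive the object) might be organized as follows. Here `ai_large_model` is a hypothetical stand-in callable, not a real model API, and the driving-information fields are assumptions.

```python
# Hypothetical driving flow with human-in-the-loop adjustment.
# `ai_large_model` stands in for the deep learning model trained on the
# target object's image data (audio, video, picture, and/or text data).

def ai_large_model(instruction):
    # placeholder: a real model would emit expression/action/voice driving data
    return {"expression": "smile", "speech": instruction}

def drive_target_virtual_object(instruction, adjustments=None):
    driving_info = ai_large_model(instruction)  # generate AI driving information
    if adjustments:                             # the first user may correct it
        driving_info.update(adjustments)        # before the object is driven
    return driving_info                         # then drive with the result

info = drive_target_virtual_object(
    "greet warmly", adjustments={"expression": "gentle smile"})
```

The adjustment step runs strictly before driving, matching the order stated in this implementation: the first user's corrections overwrite the model's proposal, and only the updated driving information reaches the target virtual object.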
In a possible implementation manner, the generating module 403 is specifically configured to:
and generating a diagnosis and treatment report according to the pre-constructed psychological disease core knowledge graph, the pre-constructed psychological disease core treatment data, the pre-acquired user information of the second user and the communication data between the second user and the target virtual object.
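One way the four inputs named above could be combined is sketched below. The matching logic is a hypothetical stand-in for the trained diagnosis and treatment report generation model, and the graph and treatment entries are invented for illustration only.

```python
# Illustrative combination of the four inputs used for report generation:
# the psychological disease core knowledge graph, the core treatment data,
# the second user's user information, and the communication data.

knowledge_graph = {"insomnia": "sleep disorder", "anxious": "anxiety symptom"}
treatment_data = {"sleep disorder": "sleep hygiene counselling",
                  "anxiety symptom": "cognitive behavioural therapy"}

def generate_report(user_info, communication_data):
    # map words in the communication data to findings via the knowledge graph
    findings = {knowledge_graph[w]
                for utterance in communication_data
                for w in utterance.lower().split()
                if w in knowledge_graph}
    # map each finding to treatment guidance via the core treatment data
    plan = [treatment_data[f] for f in sorted(findings)]
    return {"user": user_info["name"], "findings": sorted(findings), "plan": plan}

report = generate_report({"name": "second user"},
                         ["I feel anxious every evening"])
```

A trained model would replace both lookup steps, but the data flow is the same: communication data is grounded in the knowledge graph, and treatment data plus user information shape the final report.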
In a possible implementation manner, the target object is an object selected by the second user or is an object determined according to user information of the second user.
It should be noted that the division of the modules in fig. 4 is schematic and is merely a logical function division; other division manners may be used in actual implementation. For example, two or more functions may also be integrated in one processing module. The embodiments of the present disclosure are not limited in this regard. The integrated modules may be implemented in hardware or as software functional modules.
In the technical scheme of the disclosure, the acquisition, storage, application and the like of the user personal information involved all conform to the provisions of relevant laws and regulations and do not violate public order and good morals.
According to embodiments of the present disclosure, the present disclosure also provides an electronic device, a readable storage medium and a computer program product.
In an exemplary embodiment, an electronic device includes: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method as described in the above embodiments. The electronic device may be the computer or server described above.
In an exemplary embodiment, the readable storage medium may be a non-transitory computer readable storage medium storing computer instructions for causing a computer to perform the method according to the above embodiment.
In an exemplary embodiment, the computer program product comprises a computer program which, when executed by a processor, implements the method according to the above embodiments.
Fig. 5 illustrates a schematic block diagram of an example electronic device 500 that may be used to implement embodiments of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular telephones, smartphones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the disclosure described and/or claimed herein.
As shown in fig. 5, the electronic device 500 includes a computing unit 501 that can perform various appropriate actions and processes according to a computer program stored in a Read Only Memory (ROM) 502 or a computer program loaded from a storage unit 508 into a Random Access Memory (RAM) 503. In the RAM 503, various programs and data required for the operation of the electronic device 500 may also be stored. The computing unit 501, ROM 502, and RAM 503 are connected to each other by a bus 504. An input/output (I/O) interface 505 is also connected to bus 504.
A number of components in electronic device 500 are connected to I/O interface 505, including: an input unit 506 such as a keyboard, a mouse, etc.; an output unit 507 such as various types of displays, speakers, and the like; a storage unit 508 such as a magnetic disk, an optical disk, or the like; and a communication unit 509 such as a network card, modem, wireless communication transceiver, etc. The communication unit 509 allows the electronic device 500 to exchange information/data with other devices via a computer network such as the internet and/or various telecommunication networks.
The computing unit 501 may be any of various general-purpose and/or special-purpose processing components having processing and computing capabilities. Some examples of the computing unit 501 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, etc. The computing unit 501 performs the respective methods and processes described above, such as the diagnosis and treatment report generation method. For example, in some embodiments, the diagnosis and treatment report generation method may be implemented as a computer software program tangibly embodied on a machine-readable medium, such as the storage unit 508. In some embodiments, part or all of the computer program may be loaded and/or installed onto the electronic device 500 via the ROM 502 and/or the communication unit 509. When the computer program is loaded into the RAM 503 and executed by the computing unit 501, one or more steps of the above-described diagnosis and treatment report generation method may be performed. Alternatively, in other embodiments, the computing unit 501 may be configured to perform the diagnosis and treatment report generation method by any other suitable means (e.g., by means of firmware).
Various implementations of the systems and techniques described above may be implemented in digital electronic circuitry, integrated circuit systems, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems on Chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: implementation in one or more computer programs that may be executed and/or interpreted on a programmable system including at least one programmable processor, which may be a special-purpose or general-purpose programmable processor that can receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for carrying out methods of the present disclosure may be written in any combination of one or more programming languages. The program code may be provided to a processor or controller of a general-purpose computer, special-purpose computer, or other programmable data processing apparatus such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowchart and/or block diagram to be implemented. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine, or entirely on the remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and pointing device (e.g., a mouse or trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: Local Area Networks (LANs), Wide Area Networks (WANs), and the internet.
The computer system may include a client and a server. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server, a server of a distributed system, or a server incorporating a blockchain.
It should be appreciated that various forms of the flows shown above may be used to reorder, add, or delete steps. For example, the steps recited in the present disclosure may be performed in parallel, sequentially, or in a different order, provided that the desired results of the disclosed aspects are achieved, and are not limited herein.
The above detailed description should not be taken as limiting the scope of the present disclosure. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present disclosure are intended to be included within the scope of the present disclosure.

Claims (17)

1. A diagnosis and treatment report generation method, the method comprising:
constructing a target virtual object according to feature data of the target object, wherein the feature data comprises at least one of appearance feature data and voiceprint feature data;
displaying the target virtual object;
driving the target virtual object according to driving information so that a first user communicates with a second user through the target virtual object, wherein the driving information is generated according to a communication instruction of the first user;
and generating a diagnosis and treatment report according to the communication data between the second user and the target virtual object.
2. The method of claim 1, the target virtual object comprising an appearance model and a speech model, the constructing the target virtual object from feature data of the target object comprising:
acquiring characteristic data of the target object;
according to the appearance characteristic data in the characteristic data, constructing and obtaining an appearance model of the target virtual object;
and/or constructing a voice model of the target virtual object according to the voiceprint feature data in the feature data.
3. The method of claim 1 or 2, the driving information being generated from at least one of expression, motion, and voice information input by the first user.
4. The method according to claim 1 or 2, the driving the target virtual object according to driving information, comprising:
acquiring a communication instruction of the first user;
generating AI driving information through an AI large model according to the communication instruction of the first user, wherein the AI large model is a deep learning model trained by using image data of the target object in advance, and the image data comprises at least one of audio data, video data, picture data and text data;
and driving the target virtual object according to the AI driving information.
5. The method of claim 4, prior to said driving the target virtual object according to AI drive information, the method further comprising:
and updating the AI driving information according to the adjustment information of the first user on the AI driving information.
6. The method of any of claims 1-5, the generating a diagnosis and treatment report from the communication data between the second user and the target virtual object, comprising:
and generating a diagnosis and treatment report according to the pre-constructed psychological disease core knowledge graph, the pre-constructed psychological disease core treatment data, the pre-acquired user information of the second user and the communication data between the second user and the target virtual object.
7. The method according to any one of claims 1-6, wherein the target object is an object selected by the second user or is an object determined according to user information of the second user.
8. A diagnosis and treatment report generating apparatus, the apparatus comprising:
the display module is used for constructing a target virtual object according to the characteristic data of the target object, wherein the characteristic data comprises at least one of appearance characteristic data and voiceprint characteristic data; displaying the target virtual object;
the driving module is used for driving the target virtual object according to driving information so that a first user can communicate with a second user through the target virtual object, and the driving information is generated according to a communication instruction of the first user;
and the generation module is used for generating a diagnosis and treatment report according to the communication data between the second user and the target virtual object.
9. The apparatus of claim 8, the target virtual object comprising an appearance model and a speech model, the presentation module being specifically configured to:
acquiring characteristic data of the target object;
according to the appearance characteristic data in the characteristic data, constructing and obtaining an appearance model of the target virtual object;
and/or constructing a voice model of the target virtual object according to the voiceprint feature data in the feature data.
10. The apparatus of claim 8 or 9, the driving information is generated according to at least one of expression, motion, and voice input by the first user.
11. The device according to claim 8 or 9, the driving module being in particular configured to:
acquiring a communication instruction of the first user;
generating AI driving information through an AI large model according to the communication instruction of the first user, wherein the AI large model is a deep learning model trained by using image data of the target object in advance, and the image data comprises at least one of audio data, video data, picture data and text data;
and driving the target virtual object according to the AI driving information.
12. The apparatus of claim 11, the drive module further to:
and updating the AI driving information according to the adjustment information of the first user on the AI driving information before the target virtual object is driven according to the AI driving information.
13. The apparatus according to any of claims 8-12, the generating module being specifically configured to:
and generating a diagnosis and treatment report according to the pre-constructed psychological disease core knowledge graph, the pre-constructed psychological disease core treatment data, the pre-acquired user information of the second user and the communication data between the second user and the target virtual object.
14. The apparatus according to any of claims 8-13, wherein the target object is an object selected by the second user or is an object determined based on user information of the second user.
15. An electronic device, comprising: at least one processor; and a memory communicatively coupled to the at least one processor;
wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-7.
16. A non-transitory computer readable storage medium storing computer instructions for causing a computer to perform the method of any one of claims 1-7.
17. A computer program product comprising a computer program which, when executed by a processor, implements the method according to any of claims 1-7.
CN202310532844.XA 2023-05-11 2023-05-11 Diagnosis and treatment report generation method, device, equipment and storage medium Pending CN116741330A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310532844.XA CN116741330A (en) 2023-05-11 2023-05-11 Diagnosis and treatment report generation method, device, equipment and storage medium


Publications (1)

Publication Number Publication Date
CN116741330A true CN116741330A (en) 2023-09-12

Family

ID=87906926

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310532844.XA Pending CN116741330A (en) 2023-05-11 2023-05-11 Diagnosis and treatment report generation method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN116741330A (en)


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117253576A (en) * 2023-10-30 2023-12-19 来未来科技(浙江)有限公司 Outpatient electronic medical record generation method based on Chinese medical large model
CN117253576B (en) * 2023-10-30 2024-03-05 来未来科技(浙江)有限公司 Outpatient electronic medical record generation method based on Chinese medical large model


Legal Events

Date Code Title Description
PB01 Publication