ldm.util #328

Open
wants to merge 7 commits into
base: main
38 changes: 10 additions & 28 deletions README.md
@@ -4,39 +4,21 @@

See our paper: [<font size=5>Visual ChatGPT: Talking, Drawing and Editing with Visual Foundation Models</font>](https://arxiv.org/abs/2303.04671)

## Demo
<img src="./assets/demo_short.gif" width="750">

## System Architecture


<p align="center"><img src="./assets/figure.jpg" alt="Logo"></p>

## Intro
I implemented a Google Colab version that runs in a standard GPU environment.
Because my GPU memory is limited, I use only two models, `T2I` and `ImageCaption`, to process images.
You can try my Colab notebook here:

## Quick Start

```
# create a new environment
conda create -n visgpt python=3.8

# activate the new environment
conda activate visgpt

# prepare the basic environments
pip install -r requirement.txt
[![Open 2k image generation in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1bl-JAgrUru9GlGsb9hrcZCqj3uI6ev_O?usp=sharing)
## Demo
`T2I`

# download the visual foundation models
bash download.sh
<img src="./assets/dog-meme.jpg" width="750">

# prepare your private OpenAI key
export OPENAI_API_KEY={Your_Private_Openai_Key}
`ImageCaption`

# create a folder to save images
mkdir ./image
<img src="./assets/football.jpg" width="750">

# Start Visual ChatGPT !
python visual_chatgpt.py
```

## GPU memory usage
Here we list the GPU memory usage of each visual foundation model. You can modify ``self.tools`` to load fewer visual foundation models and save GPU memory:
Binary file added assets/dog-meme.jpg
Binary file added assets/football.jpg
16 changes: 8 additions & 8 deletions download.sh
@@ -3,12 +3,12 @@ ln -s ControlNet/ldm ./ldm
 ln -s ControlNet/cldm ./cldm
 ln -s ControlNet/annotator ./annotator
 cd ControlNet/models
-wget https://huggingface.co/lllyasviel/ControlNet/resolve/main/models/control_sd15_canny.pth
-wget https://huggingface.co/lllyasviel/ControlNet/resolve/main/models/control_sd15_depth.pth
-wget https://huggingface.co/lllyasviel/ControlNet/resolve/main/models/control_sd15_hed.pth
-wget https://huggingface.co/lllyasviel/ControlNet/resolve/main/models/control_sd15_mlsd.pth
-wget https://huggingface.co/lllyasviel/ControlNet/resolve/main/models/control_sd15_normal.pth
-wget https://huggingface.co/lllyasviel/ControlNet/resolve/main/models/control_sd15_openpose.pth
-wget https://huggingface.co/lllyasviel/ControlNet/resolve/main/models/control_sd15_scribble.pth
-wget https://huggingface.co/lllyasviel/ControlNet/resolve/main/models/control_sd15_seg.pth
+#wget https://huggingface.co/lllyasviel/ControlNet/resolve/main/models/control_sd15_canny.pth
+#wget https://huggingface.co/lllyasviel/ControlNet/resolve/main/models/control_sd15_depth.pth
+#wget https://huggingface.co/lllyasviel/ControlNet/resolve/main/models/control_sd15_hed.pth
+#wget https://huggingface.co/lllyasviel/ControlNet/resolve/main/models/control_sd15_mlsd.pth
+#wget https://huggingface.co/lllyasviel/ControlNet/resolve/main/models/control_sd15_normal.pth
+#wget https://huggingface.co/lllyasviel/ControlNet/resolve/main/models/control_sd15_openpose.pth
+#wget https://huggingface.co/lllyasviel/ControlNet/resolve/main/models/control_sd15_scribble.pth
+#wget https://huggingface.co/lllyasviel/ControlNet/resolve/main/models/control_sd15_seg.pth
 cd ../../
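
Commenting out every `wget` line disables all ControlNet checkpoint downloads. As one possible alternative, here is a minimal sketch that makes the selection configurable, assuming a space-separated `CONTROLNET_MODELS` environment variable. That variable and the two function names are hypothetical and not part of this PR; only the base URL and the `control_sd15_<name>.pth` naming come from the lines above.

```shell
# Sketch only: CONTROLNET_MODELS, controlnet_url and
# download_controlnet_models are hypothetical names, not part of the PR.
BASE_URL="https://huggingface.co/lllyasviel/ControlNet/resolve/main/models"

# Print the checkpoint URL for one model suffix, e.g. "canny".
controlnet_url() {
    printf '%s/control_sd15_%s.pth\n' "$BASE_URL" "$1"
}

# Download each model named in the space-separated CONTROLNET_MODELS variable.
# When the variable is unset or empty, nothing is downloaded, which matches
# the behaviour of the commented-out lines.
download_controlnet_models() {
    # Unquoted on purpose so the shell splits the list on whitespace.
    for name in ${CONTROLNET_MODELS:-}; do
        wget "$(controlnet_url "$name")"
    done
}

download_controlnet_models
```

Placed after `cd ControlNet/models`, this could be invoked as `CONTROLNET_MODELS="canny depth" bash download.sh` to fetch two checkpoints, or as plain `bash download.sh` to skip them all.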
2 changes: 1 addition & 1 deletion requirement.txt
@@ -3,7 +3,7 @@ torchvision==0.13.1
 numpy==1.23.1
 transformers==4.26.1
 albumentations==1.3.0
-opencv-contrib-python==4.3.0.36
+opencv-python==4.5.1.48
 imageio==2.9.0
 imageio-ffmpeg==0.4.2
 pytorch-lightning==1.5.0