Merge pull request #399 from mrhan1993/dev

Update Fooocus to 2.5.3
mrhan1993 · Aug 15, 2024 · 875c9b2
2 parents efc73d0 + c64d672
commit 875c9b2

Showing 121 changed files with 43,392 additions and 1,839 deletions.
diff --git a/.gitignore b/.gitignore
@@ -34,6 +34,7 @@ config.txt
 config_modification_tutorial.txt
 user_path_config.txt
 user_path_config-deprecated.txt
+hash_cache.txt
 
 sorted_styles.json
 /presets

diff --git a/README.md b/README.md
@@ -19,16 +19,65 @@
 - [License](#license)
 - [Thanks :purple\_heart:](#thanks-purple_heart)
 
-If this is the first time to use it, it is recommended to use a rewritten new project [FooocusAPI](https://github.com/mrhan1993/FooocusAPI)
+> Note:
+>
+> Fooocus 2.5 includes a significant update, with most dependencies upgraded. Therefore, after updating, do not use `--skip-pip` unless you have already performed a manual update.
+>
+> Additionally, `groundingdino-py` may encounter installation errors, especially in Chinese Windows environments. The solution can be found in the following [issue](https://github.com/IDEA-Research/GroundingDINO/issues/206).
+
+
+> GenerateMask works like DescribeImage: it is not processed as a task, and the result is returned directly.
+
+# Instructions for Using the ImageEnhance Interface
+The example below shows the main parameters required for ImageEnhance. The V1 interface breaks the enhance controllers down into form fields, similar to ImagePrompt.
+
+
+```json
+{
+  "enhance_input_image": "",
+  "enhance_checkbox": true,
+  "enhance_uov_method": "Vary (Strong)",
+  "enhance_uov_processing_order": "Before First Enhancement",
+  "enhance_uov_prompt_type": "Original Prompts",
+  "save_final_enhanced_image_only": true,
+  "enhance_ctrlnets": [
+    {
+      "enhance_enabled": false,
+      "enhance_mask_dino_prompt": "face",
+      "enhance_prompt": "",
+      "enhance_negative_prompt": "",
+      "enhance_mask_model": "sam",
+      "enhance_mask_cloth_category": "full",
+      "enhance_mask_sam_model": "vit_b",
+      "enhance_mask_text_threshold": 0.25,
+      "enhance_mask_box_threshold": 0.3,
+      "enhance_mask_sam_max_detections": 0,
+      "enhance_inpaint_disable_initial_latent": false,
+      "enhance_inpaint_engine": "v2.6",
+      "enhance_inpaint_strength": 1,
+      "enhance_inpaint_respective_field": 0.618,
+      "enhance_inpaint_erode_or_dilate": 0,
+      "enhance_mask_invert": false
+    }
+  ]
+}
+```
+
+- enhance_input_image: The image to be enhanced, which is required and can be provided as an image URL for the V2 interface.
+- enhance_checkbox: A toggle switch that must be set to true if you want to use the enhance image feature.
+- save_final_enhanced_image_only: Since image enhancement is a pipeline operation, it can produce multiple result images. This parameter allows you to only return the final enhanced image.
+
+There are three parameters related to UpscaleVary, which are used to perform Upscale or Vary before or after enhancement.
 
-A migration guide is provided [here](./docs/migrate.md).
+- enhance_uov_method: Takes the same values as in the UpscaleOrVary interface; `Disabled` turns it off.
+- enhance_uov_processing_order: Determines whether to process the image before or after enhancement.
+- enhance_uov_prompt_type: I'm not sure about the specific function; you might want to research it based on the WebUI.
 
-# :warning: Compatibility warning :warning:
+The `enhance_ctrlnets` element is a list of ImageEnhance controller objects. The list holds at most three elements; any additional elements are discarded. The parameters correspond roughly to the WebUI, and the notable ones are:
 
-When upgrading from version 3.x to version 4.0, please read the following incompatibility notes:
+- enhance_enabled: This parameter controls whether the enhance controller is active. If there are no enabled enhance controllers, the task will be skipped.
+- enhance_mask_dino_prompt: This parameter is required and indicates the area to be enhanced. If it is empty, even if the enhance controller is enabled, the task will be skipped.
 
-1. If you are using an external Fooocus model (that is, the model is not located in the `repositories` directory), delete the `repositories` directory directly, and then update the `git pull`.
-2. If not, move the `repositories` directory to any directory, delete the `repositories` directory, then update the `git pull`, and move the `models` directory back to its original location when it is finished.
 
 # Introduction
 

diff --git a/README_zh.md b/README_zh.md
@@ -19,22 +19,70 @@
 - [License](#license)
 - [Thanks :purple\_heart:](#感谢-purple_heart)
 
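-If this is your first time using it, the rewritten project [FooocusAPI](https://github.com/mrhan1993/FooocusAPI) is recommended
 
-I have also prepared a [migration guide](./docs/migrate_zh.md)
+> Note:
+>
+> Fooocus 2.5 contains a large number of updates, and most dependencies have been upgraded. After updating, do not use `--skip-pip` unless you have already updated manually.
+>
+> In addition, `groundingdino-py` may fail to install, especially in Chinese Windows environments. For a solution, see this [issue](https://github.com/IDEA-Research/GroundingDINO/issues/206).
+
+> Like DescribeImage, GenerateMask is not processed as a task; the result is returned directly.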
+
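+# Instructions for Using the ImageEnhance Interface
+
+The parameters below are an example containing the main parameters ImageEnhance needs. The V1 interface splits the enhance controllers into form fields, in a way similar to ImagePrompt: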
+
+```json
+{
+  "enhance_input_image": "",
+  "enhance_checkbox": true,
+  "enhance_uov_method": "Vary (Strong)",
+  "enhance_uov_processing_order": "Before First Enhancement",
+  "enhance_uov_prompt_type": "Original Prompts",
+  "save_final_enhanced_image_only": true,
+  "enhance_ctrlnets": [
+    {
+      "enhance_enabled": false,
+      "enhance_mask_dino_prompt": "face",
+      "enhance_prompt": "",
+      "enhance_negative_prompt": "",
+      "enhance_mask_model": "sam",
+      "enhance_mask_cloth_category": "full",
+      "enhance_mask_sam_model": "vit_b",
+      "enhance_mask_text_threshold": 0.25,
+      "enhance_mask_box_threshold": 0.3,
+      "enhance_mask_sam_max_detections": 0,
+      "enhance_inpaint_disable_initial_latent": false,
+      "enhance_inpaint_engine": "v2.6",
+      "enhance_inpaint_strength": 1,
+      "enhance_inpaint_respective_field": 0.618,
+      "enhance_inpaint_erode_or_dilate": 0,
+      "enhance_mask_invert": false
+    }
+  ]
+}
+```
+
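+- enhance_input_image: The image to enhance; required. For the v2 interface, an image URL can be provided.
+- enhance_checkbox: Master switch; must be set to true to use enhance image.
+- save_final_enhanced_image_only: Image enhancement is a pipeline job and therefore produces multiple result images; with this parameter only the final image is returned.
+
+There are three parameters related to UpscaleVary; they run Upscale or Vary before enhancement starts or after it finishes.
 
-# :warning: Compatibility warning :warning:
+- enhance_uov_method: Same as in the UpscaleOrVary interface; Disabled turns it off.
+- enhance_uov_processing_order: Whether to process the image before enhancement or to process the enhanced image.
+- enhance_uov_prompt_type: I don't know the exact function either; study it against the WebUI 🧐
 
-If you are upgrading from version 0.3.x to version 0.4.0, be sure to read the following compatibility notes:
+`enhance_ctrlnets` is a list of ImageEnhance controller objects. The list holds at most 3 elements; extras are discarded. The parameters correspond almost one-to-one to the WebUI. The parameters to watch are:
 
-1. If you use external Fooocus models (that is, the models are not located in the `repositories/Fooocus/models` directory), simply delete the `repositories` directory and then run `git pull` to update.
-2. Otherwise, move the `repositories/Fooocus/models` directory to any other location, delete the `repositories` directory, run `git pull` to update, and move the `models` directory back to its original location when finished.
+- enhance_enabled: Controls whether this enhance controller is active. If no enhance controller is enabled, the task is skipped.
+- enhance_mask_dino_prompt: Required; indicates the region to enhance. If it is empty, it is skipped even if the enhance controller is enabled.
 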
 # Introduction
 
 An API for [Fooocus](https://github.com/lllyasviel/Fooocus) built with FastAPI.
 
-Currently supported Fooocus version: [2.3.1](https://github.com/lllyasviel/Fooocus/blob/main/update_log.md).
+Currently supported Fooocus version: [2.5.3](https://github.com/lllyasviel/Fooocus/blob/main/update_log.md).
 
 ## Fooocus
 

diff --git a/docs/change_logs.md b/docs/change_logs.md
@@ -1,9 +1,13 @@
 # ChangeLog for Fooocus-API
 
-## [UNRELEASE]
+## [v0.5.0.1]
 ### Changed
+- Update Fooocus to v2.5.3
+- Add enhance image endpoint
+- Add generate mask endpoint
+- Refactor worker.py to follow the upstream Fooocus changes
 - Update docs
-- Returnd base64 str now include identifier like this `data:image/jpeg;base64,`
+- Returned base64 strings now include an identifier prefix such as `data:image/jpeg;base64,`
 
 ### Fixed
 - Issue #375
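
The changelog entry above says returned base64 strings now carry a prefix such as `data:image/jpeg;base64,`. Clients that previously passed the string straight to a base64 decoder may need to strip that prefix first. The following is a minimal client-side sketch, not part of this commit; the helper name `decode_image_string` is hypothetical, and it assumes the prefix always has the `data:<mime>;base64,` shape shown in the changelog.

```python
import base64
from typing import Optional, Tuple


def decode_image_string(value: str) -> Tuple[Optional[str], bytes]:
    """Decode a base64 image string, with or without a data-URI prefix.

    Hypothetical helper: returns (mime type or None, raw image bytes).
    """
    mime = None
    if value.startswith("data:") and "," in value:
        # Split "data:image/jpeg;base64" from the base64 payload.
        header, value = value.split(",", 1)
        mime = header[len("data:"):].split(";", 1)[0]
    return mime, base64.b64decode(value)


# Stand-in payload; a real response would contain actual image data.
sample = "data:image/jpeg;base64," + base64.b64encode(b"not a real image").decode("ascii")
mime, raw = decode_image_string(sample)
print(mime, len(raw))
```

Accepting both the prefixed and the bare form keeps the same client code usable against older and newer server versions.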

diff --git a/docs/change_logs_zh.md b/docs/change_logs_zh.md
@@ -1,7 +1,11 @@
 # ChangeLog for Fooocus-API
 
-## [UNRELEASE]
+## [v0.5.0.1]
 ### Changed
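+- Merged to Fooocus 2.5.3
+- Added image enhance endpoint
+- Added mask generation endpoint
+- Refactored worker.py because of the Fooocus changes
 - Updated docs
 - base64 strings in returned data now include an image identifier, such as `data:image/jpeg;base64,`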
 

diff --git a/examples/examples.ipynb b/examples/examples.ipynb
@@ -9,22 +9,14 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 8,
+   "execution_count": null,
    "metadata": {
     "ExecuteTime": {
      "end_time": "2024-04-04T10:19:57.369099Z",
      "start_time": "2024-04-04T10:19:57.328298Z"
     }
    },
-   "outputs": [
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "{'job_id': 'b53d8776-62a6-4398-bb8f-8118dc57403a', 'job_type': 'Text to Image', 'job_stage': 'WAITING', 'job_progress': 0, 'job_status': None, 'job_step_preview': None, 'job_result': None}\n"
-     ]
-    }
-   ],
+   "outputs": [],
    "source": [
     "import requests\n",
     "import json\n",
@@ -43,7 +35,6 @@
     "    return response.json()\n",
     "\n",
     "result =text2img({\n",
-    "    \"prompt\": \"1girl sitting on the ground\",\n",
     "    \"performance_selection\": \"Lightning\",\n",
     "    \"async_process\": True\n",
     "})\n",
@@ -84,7 +75,7 @@
     "result =upscale_vary(\n",
     "    image=image,\n",
     "    params={\n",
-    "        \"uov_method\": \"Upscale (2x)\",\n",
+    "        \"uov_method\": \"Vary\",\n",
     "        \"async_process\": True\n",
     "    })\n",
     "print(json.dumps(result, indent=4, ensure_ascii=False))"
@@ -352,8 +343,8 @@
     "# image_prompt v2 Interface example\n",
     "host = \"http://127.0.0.1:8888\"\n",
     "image = open(\"./imgs/bear.jpg\", \"rb\").read()\n",
-    "source = open(\"./imgs/s.jpg\", \"rb\").read()\n",
-    "mask = open(\"./imgs/m.png\", \"rb\").read()\n",
+    "source = open(\"./imgs/inpaint_source.jpg\", \"rb\").read()\n",
+    "mask = open(\"./imgs/inpaint_mask.png\", \"rb\").read()\n",
     "\n",
     "def image_prompt(params: dict) -> dict:\n",
     "    \"\"\"\n",
@@ -522,7 +513,7 @@
    "name": "python",
    "nbconvert_exporter": "python",
    "pygments_lexer": "ipython3",
-   "version": "3.10.10"
+   "version": "3.10.13"
   }
  },
  "nbformat": 4,

diff --git a/examples/examples_v1.py b/examples/examples_v1.py
@@ -228,3 +228,39 @@ def image_prompt(
     cn_img2=ImageList.image_prompt_1,
 )
 print(json.dumps(ip_result))
+
+# ################################################################
+# Image Enhance
+# ################################################################
+
+# Image Enhance
+
+import requests
+
+url = "http://localhost:8888/v1/generation/image-enhance"
+
+# Define the file path and other form data
+file_path = "./examples/imgs/source_face_man.png"
+form_data = {
+    "enhance_checkbox": True,
+    "enhance_uov_method": "Disabled",
+    "enhance_enabled_1": True,
+    "enhance_mask_dino_prompt_1": "face",
+    "enhance_enabled_2": True,
+    "enhance_mask_dino_prompt_2": "eyes",
+}
+
+# Open the file and prepare it for the request
+with open(file_path, "rb") as f:
+    image = f.read()
+
+# Send the request
+response = requests.post(
+    url,
+    files={"enhance_input_image": image},
+    data=form_data,
+    timeout=180)
+
+# Print the response content
+print(response.text)
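
The V1 example above writes each controller as suffixed form fields (`enhance_enabled_1`, `enhance_mask_dino_prompt_2`), while the V2 payload uses a list of controller objects. A small helper can build the V1 fields from the list form. This is only a sketch: `flatten_controllers` is a hypothetical name, and it assumes every controller field follows the same `<name>_<index>` pattern as the two fields shown in the example.

```python
from typing import Any, Dict, List


def flatten_controllers(controllers: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Turn a list of controller dicts into suffixed V1-style form fields.

    Hypothetical helper: the first controller gets suffix _1, the second _2, ...
    """
    form: Dict[str, Any] = {}
    for index, controller in enumerate(controllers, start=1):
        for key, value in controller.items():
            form[f"{key}_{index}"] = value
    return form


form_data = {
    "enhance_checkbox": True,
    "enhance_uov_method": "Disabled",
    **flatten_controllers([
        {"enhance_enabled": True, "enhance_mask_dino_prompt": "face"},
        {"enhance_enabled": True, "enhance_mask_dino_prompt": "eyes"},
    ]),
}
print(form_data)
```

The resulting `form_data` matches the dictionary in the example above and can be passed as `data=form_data` to `requests.post`.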
diff --git a/examples/examples_v2.py b/examples/examples_v2.py
@@ -230,3 +230,59 @@ def text2image_image_prompt(params: dict) -> dict:
 
 t2i_ip_result = text2image_image_prompt(params=t2i_ip_params)
 print(json.dumps(t2i_ip_result))
+
+# ################################################################
+# Image Enhance
+# ################################################################
+
+# Image Enhance
+
+import requests
+import json
+
+url = "http://localhost:8888/v2/generation/image-enhance"
+
+headers = {
+    "Content-Type": "application/json"
+}
+
+data = {
+    "enhance_input_image": "https://github.com/mrhan1993/Fooocus-API/main/examples/imgs/source_face_man.png",
+    "enhance_checkbox": True,
+    "enhance_uov_method": "Vary (Strong)",
+    "enhance_uov_processing_order": "Before First Enhancement",
+    "enhance_uov_prompt_type": "Original Prompts",
+    "enhance_ctrlnets": [
+        {
+            "enhance_enabled": True,
+            "enhance_mask_dino_prompt": "face",
+            "enhance_prompt": "",
+            "enhance_negative_prompt": "",
+            "enhance_mask_model": "sam",
+            "enhance_mask_cloth_category": "full",
+            "enhance_mask_sam_model": "vit_b",
+            "enhance_mask_text_threshold": 0.25,
+            "enhance_mask_box_threshold": 0.3,
+            "enhance_mask_sam_max_detections": 0,
+            "enhance_inpaint_disable_initial_latent": False,
+            "enhance_inpaint_engine": "v2.6",
+            "enhance_inpaint_strength": 1.0,
+            "enhance_inpaint_respective_field": 0.618,
+            "enhance_inpaint_erode_or_dilate": 0.0,
+            "enhance_mask_invert": False
+        }
+    ]
+}
+
+response = requests.post(
+    url,
+    headers=headers,
+    data=json.dumps(data),
+    timeout=180)
+
+if response.status_code == 200:
+    print("Request successful!")
+    print("Response:", response.json())
+else:
+    print("Request failed with status code:", response.status_code)
+    print("Response:", response.text)
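
The README in this commit states that `enhance_ctrlnets` keeps at most three controllers and that a controller is skipped when it is disabled or its `enhance_mask_dino_prompt` is empty. A client can apply the same rules before sending a request to see which controllers would actually run. The sketch below is an illustration, not the server's code: `active_controllers` is a hypothetical name, and two details are assumptions, namely that truncation to three happens before filtering and that a whitespace-only prompt counts as empty.

```python
from typing import Any, Dict, List

MAX_CONTROLLERS = 3  # limit stated in the README; extra entries are discarded


def active_controllers(controllers: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Return the controllers that would take part in enhancement.

    Hypothetical client-side check mirroring the rules described in the README.
    """
    kept = controllers[:MAX_CONTROLLERS]  # assumption: truncate first, then filter
    return [
        c for c in kept
        if c.get("enhance_enabled")
        and str(c.get("enhance_mask_dino_prompt") or "").strip()
    ]


controllers = [
    {"enhance_enabled": True, "enhance_mask_dino_prompt": "face"},
    {"enhance_enabled": False, "enhance_mask_dino_prompt": "eyes"},
    {"enhance_enabled": True, "enhance_mask_dino_prompt": ""},
    {"enhance_enabled": True, "enhance_mask_dino_prompt": "hands"},
]
usable = active_controllers(controllers)
if not usable:
    print("No active controller: the enhance task would be skipped.")
else:
    print([c["enhance_mask_dino_prompt"] for c in usable])
```

Here the fourth controller is dropped by the three-element limit, and the second and third are filtered out, so only the `face` controller remains.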
diff --git a/extras/BLIP/configs/bert_config.json b/extras/BLIP/configs/bert_config.json
@@ -0,0 +1,21 @@
+{
+  "architectures": [
+    "BertModel"
+  ],
+  "attention_probs_dropout_prob": 0.1,
+  "hidden_act": "gelu",
+  "hidden_dropout_prob": 0.1,
+  "hidden_size": 768,
+  "initializer_range": 0.02,
+  "intermediate_size": 3072,
+  "layer_norm_eps": 1e-12,
+  "max_position_embeddings": 512,
+  "model_type": "bert",
+  "num_attention_heads": 12,
+  "num_hidden_layers": 12,
+  "pad_token_id": 0,
+  "type_vocab_size": 2,
+  "vocab_size": 30522,
+  "encoder_width": 768,
+  "add_cross_attention": true   
+}