PaddleOCR ch_PP-OCRv4_rec_hgnet raises "list index out of range" while reading data, but earlier versions do not #10798

dhhcj · 2023-09-01T06:02:53Z

Please provide the following information to quickly locate the problem

System Environment:
Version: Paddle: PaddleOCR: Related components:
Command Code:
Complete Error Message:

We provide AceIssueSolver to solve issues, do you want it? (Please write yes/no):

Please try not to include images in the issue.
This is the V4 version:

The old V2 version:

The old version does not have this problem.

dhhcj · 2023-09-01T06:41:43Z

The error is raised at line 65. As far as I remember, these two maps belong to the DB detection model.
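One plausible mechanism for the message, sketched under assumptions: if the data loader looks up the detection-only transforms (`MakeBorderMap`, `MakeShrinkMap`) by name and takes the first match with `[0]`, a recognition config that does not list them would produce an empty match list and an `IndexError`. The transform list, the `find_transform_index` helper, and the `epoch` key below are hypothetical and only illustrate the failure mode; this is not the PaddleOCR source.

```python
def find_transform_index(transforms, name):
    """Return the index of the first transform dict keyed by `name`, or None."""
    for index, op in enumerate(transforms):
        if name in op:
            return index
    return None


# Hypothetical transform list for a recognition config: it contains no
# detection-only ops such as MakeBorderMap or MakeShrinkMap.
rec_transforms = [
    {"DecodeImage": {"img_mode": "BGR"}},
    {"RecAug": None},
    {"KeepKeys": {"keep_keys": ["image"]}},
]

# Unguarded lookup: indexing [0] into an empty match list raises IndexError.
try:
    border_id = [
        i for i, op in enumerate(rec_transforms) if "MakeBorderMap" in op
    ][0]
    message = None
except IndexError as err:
    message = str(err)

print(message)

# Guarded lookup: only touch the transform if the config actually has it.
border_id = find_transform_index(rec_transforms, "MakeBorderMap")
if border_id is not None:
    rec_transforms[border_id]["MakeBorderMap"]["epoch"] = 0
```

With a guard like this, a config that lacks the two maps would simply skip the epoch update instead of reporting an error.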

walasad · 2023-09-01T20:41:02Z

Same problem here. Could you please share your .yml and train_data folder structure?

iweirman · 2023-09-06T11:47:39Z

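This looks like a problem introduced by the V4 text detection model config file.

It appears to come from the newly introduced dynamic shrink ratio: during training, the shrink ratio is changed from a fixed value to a dynamic one, increasing linearly from 0.4 to 0.6 as the training epoch increases. With this strategy, the hmean of the PP-OCRv4 student detection model improves from 76.97% to 78.24%.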

    - MakeBorderMap:
        shrink_ratio: 0.4
        thresh_min: 0.3
        thresh_max: 0.7
        total_epoch: *epoch_num
    - MakeShrinkMap:
        shrink_ratio: 0.4
        min_text_size: 8
        total_epoch: *epoch_num

It shouldn't affect recognition accuracy, should it? @dhhcj, have you tested how this model performs?
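The linear schedule described above can be sketched as follows. This is a minimal illustration that assumes the ratio is interpolated per epoch between 0.4 and 0.6; the function name `dynamic_shrink_ratio` and the value `total_epoch=500` are hypothetical, and the actual PaddleOCR implementation may compute it differently.

```python
def dynamic_shrink_ratio(epoch, total_epoch, start=0.4, end=0.6):
    """Linearly interpolate the shrink ratio from `start` to `end` over training."""
    if total_epoch <= 0:
        # No schedule information available: fall back to the fixed ratio.
        return start
    # Clamp progress to [0, 1] so out-of-range epochs stay within bounds.
    progress = min(max(epoch / total_epoch, 0.0), 1.0)
    return start + (end - start) * progress


# Hypothetical 500-epoch run: ratio at the start, midpoint, and end.
for epoch in (0, 250, 500):
    print(epoch, round(dynamic_shrink_ratio(epoch, total_epoch=500), 3))
```

This would explain why the config passes `total_epoch` to both `MakeBorderMap` and `MakeShrinkMap`: each transform needs the total epoch count to know how far along the schedule it is.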

dhhcj · 2023-09-07T03:09:09Z

> Same problem here, could you please share your .yml and train_data folder structure

I used the official ch_PP-OCRv4_rec_hgnet.yml and only changed the default label_file_list and pretrained_model.
The data structure is the same as for the previous PP-OCR V2/V3.

dhhcj · 2023-09-07T03:10:52Z

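> This looks like a problem introduced by the V4 text detection model config file.
>
> It appears to come from the newly introduced dynamic shrink ratio: during training, the shrink ratio is changed from a fixed value to a dynamic one, increasing linearly from 0.4 to 0.6 as the training epoch increases. With this strategy, the hmean of the PP-OCRv4 student detection model improves from 76.97% to 78.24%.
>
>     - MakeBorderMap:
>         shrink_ratio: 0.4
>         thresh_min: 0.3
>         thresh_max: 0.7
>         total_epoch: *epoch_num
>     - MakeShrinkMap:
>         shrink_ratio: 0.4
>         min_text_size: 8
>         total_epoch: *epoch_num
>
> It shouldn't affect recognition accuracy, should it? @dhhcj, have you tested how this model performs?

Training itself has always run fine; there are just some warnings, which are annoying to look at.

Testing works fine too.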

iweirman · 2023-09-07T07:36:17Z

Hmm, just comment it out for training. As long as it doesn't affect the results, that's fine.

futureisatyourhand · 2024-02-19T11:31:38Z

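> This looks like a problem introduced by the V4 text detection model config file.
>
> It appears to come from the newly introduced dynamic shrink ratio: during training, the shrink ratio is changed from a fixed value to a dynamic one, increasing linearly from 0.4 to 0.6 as the training epoch increases. With this strategy, the hmean of the PP-OCRv4 student detection model improves from 76.97% to 78.24%.
>
>     - MakeBorderMap:
>         shrink_ratio: 0.4
>         thresh_min: 0.3
>         thresh_max: 0.7
>         total_epoch: *epoch_num
>     - MakeShrinkMap:
>         shrink_ratio: 0.4
>         min_text_size: 8
>         total_epoch: *epoch_num
>
> It shouldn't affect recognition accuracy, should it? @dhhcj, have you tested how this model performs?

Hi, have you tried the dynamic shrink ratio? Does it really improve accuracy?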

iweirman mentioned this issue Sep 7, 2023

Fix DSR error caused by modifying data augmentation #10662

Merged

paddle-bot bot assigned tink2123 Mar 8, 2024

PaddlePaddle locked and limited conversation to collaborators May 25, 2024

SWHL converted this issue into discussion #12239 May 25, 2024

This issue was moved to a discussion.
