This issue was moved to a discussion.

PaddleOCR ch_PP-OCRv4_rec_hgnet raises "list index out of range" when reading data, but earlier versions did not #10798

Closed
dhhcj opened this issue Sep 1, 2023 · 7 comments
@dhhcj

dhhcj commented Sep 1, 2023

Please provide the following information to quickly locate the problem

  • System Environment:
  • Version: Paddle: PaddleOCR: Related components:
  • Command Code:
  • Complete Error Message:

We provide AceIssueSolver to solve issues, do you want it? (Please write yes/no):

Please try not to include images in the issue.
This is the V4 version:
image
The old V2 version:
image
This problem does not occur there.

@dhhcj
Author

dhhcj commented Sep 1, 2023

image
The error is raised at line 65. As far as I remember, these two maps belong to the DB detection model.

@walasad

walasad commented Sep 1, 2023

Same problem here, could you please share your .yml and train_data folder structure

@iweirman

iweirman commented Sep 6, 2023

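This looks like a problem introduced by the V4 text detection model's config file.

It is probably the dynamic shrink ratio that was introduced: during training, the shrink ratio changes from a fixed value to a dynamic one, increasing linearly from 0.4 to 0.6 as the training epoch increases. On the PP-OCRv4 student detection model, this strategy raises hmean from 76.97% to 78.24%.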

    - MakeBorderMap:
        shrink_ratio: 0.4
        thresh_min: 0.3
        thresh_max: 0.7
        total_epoch: *epoch_num
    - MakeShrinkMap:
        shrink_ratio: 0.4
        min_text_size: 8
        total_epoch: *epoch_num
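
The linear schedule described above can be sketched roughly as follows. This is only an illustration of "0.4 at the start, 0.6 at the end, linear in between"; the function name `dynamic_shrink_ratio` and its `start`/`end` parameters are hypothetical and are not taken from the PaddleOCR source.

```python
def dynamic_shrink_ratio(epoch, total_epoch, start=0.4, end=0.6):
    """Linearly interpolate the shrink ratio from `start` to `end`.

    Hypothetical helper for illustration only; not PaddleOCR's actual code.
    """
    if total_epoch <= 0:
        raise ValueError("total_epoch must be positive")
    # Clamp training progress to [0, 1] so out-of-range epochs do not overshoot.
    progress = min(max(epoch / total_epoch, 0.0), 1.0)
    return start + (end - start) * progress


# Print the ratio at the start, middle, and end of a 500-epoch run.
for epoch in (0, 250, 500):
    print(epoch, round(dynamic_shrink_ratio(epoch, 500), 3))
```

Under this assumption, `total_epoch` in the config would be what lets each transform know how far training has progressed.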

It shouldn't affect the recognition results, should it? @dhhcj, have you tested how this model performs?

@dhhcj
Author

dhhcj commented Sep 7, 2023

Same problem here, could you please share your .yml and train_data folder structure

The official ch_PP-OCRv4_rec_hgnet.yml; I just changed the default label_file_list and pretrained_model.
The data structure is the same as for the previous PP-OCR V2/V3.

@dhhcj
Author

dhhcj commented Sep 7, 2023

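This looks like a problem introduced by the V4 text detection model's config file.

It is probably the dynamic shrink ratio that was introduced: during training, the shrink ratio changes from a fixed value to a dynamic one, increasing linearly from 0.4 to 0.6 as the training epoch increases. On the PP-OCRv4 student detection model, this strategy raises hmean from 76.97% to 78.24%.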

    - MakeBorderMap:
        shrink_ratio: 0.4
        thresh_min: 0.3
        thresh_max: 0.7
        total_epoch: *epoch_num
    - MakeShrinkMap:
        shrink_ratio: 0.4
        min_text_size: 8
        total_epoch: *epoch_num

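It shouldn't affect the recognition results, should it? @dhhcj, have you tested how this model performs?

Training itself has always run fine; there are just some warnings that are annoying to look at.
image
Testing shows no problems either.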

@iweirman

iweirman commented Sep 7, 2023

Hmm, just comment it out for training then; as long as it doesn't affect the results, that's fine.

@futureisatyourhand

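This looks like a problem introduced by the V4 text detection model's config file.

It is probably the dynamic shrink ratio that was introduced: during training, the shrink ratio changes from a fixed value to a dynamic one, increasing linearly from 0.4 to 0.6 as the training epoch increases. On the PP-OCRv4 student detection model, this strategy raises hmean from 76.97% to 78.24%.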

    - MakeBorderMap:
        shrink_ratio: 0.4
        thresh_min: 0.3
        thresh_max: 0.7
        total_epoch: *epoch_num
    - MakeShrinkMap:
        shrink_ratio: 0.4
        min_text_size: 8
        total_epoch: *epoch_num

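It shouldn't affect the recognition results, should it? @dhhcj, have you tested how this model performs?

Hi, have you tried the dynamic shrink ratio? Does it really improve the score?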

@PaddlePaddle PaddlePaddle locked and limited conversation to collaborators May 25, 2024
@SWHL SWHL converted this issue into discussion #12239 May 25, 2024


5 participants