Skip to content

Refactoring dataset and model classes #312

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 17 commits into
base: master
Choose a base branch
from

Conversation

SakhinetiPraveena
Copy link
Collaborator

No description provided.

@SakhinetiPraveena SakhinetiPraveena marked this pull request as draft June 6, 2025 05:16
Copy link
Collaborator

@dpascualhe dpascualhe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The structure looks good to me. The only thing that worries me a bit is the amount of classes per module. Maybe we could split dataset and model modules. E.g.:

detectionmetrics
|-datasets
|    |-perception.py
|    |-segmentation.py
|    |-detection.py
|    |...
|-models
     |-perception.py
     |-segmentation.py
     |-detection.py
     |...

What do you think? I don't have a strong opinion.

In any case, keep up the good work! 😄

@SakhinetiPraveena
Copy link
Collaborator Author

Thanks for the feedback! That makes sense to me as well, I'll make the changes according to that. Could you also review the functions for detection classes, anything I missed.

And after this I am thinking defining coco.py could be a potential next step. What do you think?

Copy link
Collaborator

@dpascualhe dpascualhe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'll have to update all references to the old classes in the repo before merging, but looking great now 😄

@SakhinetiPraveena SakhinetiPraveena marked this pull request as ready for review July 12, 2025 18:22
Copy link
Collaborator

@dpascualhe dpascualhe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've finally had time to review the PR in depth. I've found some issues that need to be addressed, mainly:

  • Fix model/dataset ontologies format mismatch (stick to {"class_name_a": {idx: 0, rgb: [0, 0, 0]}, "class_name_b": {idx: 1, rgb: [0, 0, 0]}}, ... format)
  • Avoid doing any ontology conversion in the example notebook. The evaluated model output already matches the COCO ontology definition.
  • Apply black formatter to all updated files.

Another minor thing that I'd recommend is removing the "mAP" column in the results dataframe as it is redundant. "mAP" could be reported in column "mean".

Despite these issues, the code is clear and well structured. Great job so far! Let's fix these minor issues and merge the PR 😄

PD: I've pushed to this branch https://github.com/JdeRobot/DetectionMetrics/tree/dph/object_detection the changes I've done to make the notebook example work (I'm getting a 46.6 mAP for COCO validation set, which seems reasonable)

app.py Outdated
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should keep the UI out from this PR. Changes should be pushed to a new branch linked to #243

rows.append({
"image": img_info["file_name"],
"annotation": str(img_id),
"split": "train" # Default split - could be enhanced to read from COCO
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As of now, we only support loading a single "split", which should be enough for now., but let's add a split parameter to the CocoDataset init so that we can at least fill in this value appropriately

for cat in coco.loadCats(coco.getCatIds()):
ontology[cat["name"]] = {
"idx": cat["id"],
"name": cat["name"],
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"name" is already present as the key of the ontology dictionary. Let's remove the "name" property to avoid redundancy. Also, in segmentation datasets the format is {"class_name_a": {idx: 0, rgb: [0, 0, 0]}, "class_name_b": {idx: 1, rgb: [0, 0, 0]}}, so it would be nice to have the same structure.

)

# Init metrics
metrics_factory = um.DetectionMetricsFactory(iou_threshold=0.5,num_classes=self.n_classes)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Beware! IoU threshold is hardcoded here, it should be read from the model config right?

Comment on lines +26 to +29
Streamlit = "1.46.0"
streamlit-image-select = "^0.6.0"
supervision = "^0.18.0"

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these requirement updates should go in a new branch linked to #243

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same for all GUI related files

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the example notebook, I've found a couple of things that should be addressed:

  • Model ontology format differs from the dataset one. It should keep the same structure. As mentioned in previous comment, the most appropriate format would be {"class_name_a": {idx: 0, rgb: [0, 0, 0]}, "class_name_b": {idx: 1, rgb: [0, 0, 0]}}.
  • There should be no ontology conversion in this case. The model outputs the indices as expected by the COCO dataset format already. I've been getting wrong results and after making sure no ontology conversion is applied they seem to be fixed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants