-
Notifications
You must be signed in to change notification settings - Fork 47
Refactoring dataset and model classes #312
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The structure looks good to me. The only thing that worries me a bit is the amount of classes per module. Maybe we could split dataset and model modules. E.g.:
detectionmetrics
|-datasets
| |-perception.py
| |-segmentation.py
| |-detection.py
| |...
|-models
|-perception.py
|-segmentation.py
|-detection.py
|...
What do you think? I don't have a strong opinion.
In any case, keep up the good work! 😄
Thanks for the feedback! That makes sense to me as well, I'll make the changes according to that. Could you also review the functions for detection classes, anything I missed. And after this I am thinking defining coco.py could be a potential next step. What do you think? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We'll have to update all references to the old classes in the repo before merging, but looking great now 😄
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've finally had time to review the PR in depth. I've found some issues that need to be addressed, mainly:
- Fix model/dataset ontologies format mismatch (stick to
{"class_name_a": {idx: 0, rgb: [0, 0, 0]}, "class_name_b": {idx: 1, rgb: [0, 0, 0]}}, ...
format) - Avoid doing any ontology conversion in the example notebook. The evaluated model output already matches the COCO ontology definition.
- Apply black formatter to all updated files.
Another minor thing that I'd recommend is removing the "mAP" column in the results dataframe as it is redundant. "mAP" could be reported in column "mean".
Despite these issues, the code is clear and well structured. Great job so far! Let's fix these minor issues and merge the PR 😄
PD: I've pushed to this branch https://github.com/JdeRobot/DetectionMetrics/tree/dph/object_detection the changes I've done to make the notebook example work (I'm getting a 46.6 mAP for COCO validation set, which seems reasonable)
app.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should keep the UI out from this PR. Changes should be pushed to a new branch linked to #243
rows.append({ | ||
"image": img_info["file_name"], | ||
"annotation": str(img_id), | ||
"split": "train" # Default split - could be enhanced to read from COCO |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As of now, we only support loading a single "split", which should be enough for now., but let's add a split parameter to the CocoDataset init so that we can at least fill in this value appropriately
detectionmetrics/datasets/coco.py
Outdated
for cat in coco.loadCats(coco.getCatIds()): | ||
ontology[cat["name"]] = { | ||
"idx": cat["id"], | ||
"name": cat["name"], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"name" is already present as the key of the ontology dictionary. Let's remove the "name" property to avoid redundancy. Also, in segmentation datasets the format is {"class_name_a": {idx: 0, rgb: [0, 0, 0]}, "class_name_b": {idx: 1, rgb: [0, 0, 0]}}
, so it would be nice to have the same structure.
) | ||
|
||
# Init metrics | ||
metrics_factory = um.DetectionMetricsFactory(iou_threshold=0.5,num_classes=self.n_classes) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Beware! IoU threshold is hardcoded here, it should be read from the model config right?
Streamlit = "1.46.0" | ||
streamlit-image-select = "^0.6.0" | ||
supervision = "^0.18.0" | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these requirement updates should go in a new branch linked to #243
tabs/dataset_viewer.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same for all GUI related files
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the example notebook, I've found a couple of things that should be addressed:
- Model ontology format differs from the dataset one. It should keep the same structure. As mentioned in previous comment, the most appropriate format would be
{"class_name_a": {idx: 0, rgb: [0, 0, 0]}, "class_name_b": {idx: 1, rgb: [0, 0, 0]}}
. - There should be no ontology conversion in this case. The model outputs the indices as expected by the COCO dataset format already. I've been getting wrong results and after making sure no ontology conversion is applied they seem to be fixed.
No description provided.