yusufesatt/data-management

Data Management Tools

This repository contains various tools for data management and organization.

The tools cover tasks such as performing operations on datasets, removing samples with empty labels, and calculating per-class annotation counts.

Tools

1. Empty Data Remover

This tool finds every sample in the dataset whose label txt file is empty and deletes both its image file and its label file.

Sample folder schema:

Warning: Apart from the train, test, and valid folders, the root folder may contain only txt and yaml files. Do not add files with other extensions.

├─data
    ├─train
        ├─images
        ├─labels
    ├─test
        ├─images
        ├─labels
    ├─valid
        ├─images
        ├─labels

Use Case:

Note: Currently, this tool only works with labels in txt format.

python empty_data_remover.py --input_path path/to/your/root/folder
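
To illustrate the idea behind this tool, here is a minimal sketch of how empty-label samples could be detected and removed. It is not the code in empty_data_remover.py. The function name `remove_empty_samples`, the `dry_run` parameter, and the list of image extensions are assumptions made for this example. It also assumes that an image and its label share the same file stem.

```python
from pathlib import Path

# Assumed image extensions; the actual tool may recognise a different set.
IMAGE_EXTS = (".jpg", ".jpeg", ".png", ".bmp")
SPLITS = ("train", "test", "valid")


def remove_empty_samples(root, dry_run=True):
    """Find samples whose label file is empty; delete them unless dry_run.

    Returns a list of (label_path, [image_paths]) tuples for the matches.
    """
    matches = []
    for split in SPLITS:
        labels_dir = Path(root) / split / "labels"
        images_dir = Path(root) / split / "images"
        if not labels_dir.is_dir():
            continue  # this split is missing; skip it
        for label in sorted(labels_dir.glob("*.txt")):
            if label.read_text().strip():
                continue  # label has at least one annotation; keep the sample
            # Collect images that share the label's stem (assumption).
            images = [images_dir / (label.stem + ext) for ext in IMAGE_EXTS]
            images = [p for p in images if p.exists()]
            matches.append((label, images))
            if not dry_run:
                label.unlink()
                for image in images:
                    image.unlink()
    return matches
```

Calling `remove_empty_samples("path/to/your/root/folder")` with the default `dry_run=True` only lists what would be deleted, which is a cautious way to check the result before passing `dry_run=False`.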

2. Txt Class Counter

This tool counts how many annotations each class has by reading the txt files in the specified folder.

This helps you see whether the classes in your dataset are balanced or imbalanced.

Note: This tool currently only works with labels in txt format, but support for COCO format is planned.

Use Case:

python txt_class_count.py --input_path path/to/your/dataset/train

Output:

Category ID: 1 Number of Annotations: 100
Category ID: 2 Number of Annotations: 200
Category ID: 3 Number of Annotations: 300
Total number of annotations: 600
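
The counting step can be sketched roughly as follows. This is an illustration, not the code in txt_class_count.py. It assumes that each non-empty line in a label file describes one annotation and starts with an integer class ID (as in YOLO-style labels, which this README does not state explicitly). The function names `count_classes` and `format_report` are hypothetical.

```python
from collections import Counter
from pathlib import Path


def count_classes(folder):
    """Count annotations per class ID across all txt files under folder."""
    counts = Counter()
    for txt in Path(folder).rglob("*.txt"):
        for line in txt.read_text().splitlines():
            parts = line.split()
            # Assumption: the first token of each line is an integer class ID.
            if parts and parts[0].isdigit():
                counts[int(parts[0])] += 1
    return counts


def format_report(counts):
    """Render the counts in the same layout as the sample output above."""
    lines = [
        f"Category ID: {class_id} Number of Annotations: {n}"
        for class_id, n in sorted(counts.items())
    ]
    lines.append(f"Total number of annotations: {sum(counts.values())}")
    return "\n".join(lines)
```

For example, `print(format_report(count_classes("path/to/your/dataset/train")))` would print one line per class followed by the total. Lines whose first token is not an integer are skipped in this sketch rather than raising an error.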

Contributing

If you encounter issues or have suggestions for improvements, please report them on the GitHub repository 🚀.

License

This project is licensed under the MIT License.
