
[feature] Smarter Handling of Image Data Format #8227

Closed
cancan101 opened this issue Mar 9, 2017 · 13 comments
Labels
stale · stat:awaiting response · stat:contribution welcome · type:feature

Comments

@cancan101
Contributor

cancan101 commented Mar 9, 2017

Right now the responsibility for choosing the image data format (i.e. the representation of batches of images) rests with the data scientist (i.e. the model writer). I suggest there should be a solution in TF to move this to the optimizer (XLA perhaps?) or, at worst, the Op writer.

For some background:

Currently TensorFlow supports NCHW and NHWC (though other formats like CHWN might be possible down the road). Many of the ops support both formats. That being said, the docs say:

The best practice is to build models that work with both NCHW and NHWC as it is common to train using NCHW on GPU, and then do inference with NHWC on CPU.

This requires the user to do some "wrangling" (e.g. loading the checkpoint of weights and re-building the graph in Python) to map from one image format to another. Further, this must be done with some knowledge of the platform on which the graph will be executed (i.e. which ops are defined and, if both formats are supported, which is faster).
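That wrangling is mostly a matter of permuting axes. A minimal, framework-free sketch of the two conversions (pure NumPy; the helper names here are hypothetical, not TensorFlow APIs):

```python
import numpy as np

# Illustrative only: converting a batch of images between the two layouts.
# These helper names are made up for this sketch, not TensorFlow functions.

def nhwc_to_nchw(batch):
    """Move channels from last to second axis: (N, H, W, C) -> (N, C, H, W)."""
    return np.transpose(batch, (0, 3, 1, 2))

def nchw_to_nhwc(batch):
    """Inverse permutation: (N, C, H, W) -> (N, H, W, C)."""
    return np.transpose(batch, (0, 2, 3, 1))

x = np.zeros((8, 32, 32, 3))  # NHWC batch: 8 RGB images, 32x32
assert nhwc_to_nchw(x).shape == (8, 3, 32, 32)
assert nchw_to_nhwc(nhwc_to_nchw(x)).shape == x.shape
```

The permutations are exact inverses of each other, which is why a pair of them bracketing a subgraph is a no-op apart from the copy cost.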

Right now the model builder must build the model to take in a channel order and pass it around. Ideally the model could be written once, with enough meta information attached to the graph to allow optimizers after the fact (i.e. at inference time, on other platforms) to choose the best representation. Further, even at training time it would be great if the data scientist didn't need to be concerned with the image data format (i.e. dimension ordering) and could use abstractions for accessing results that took care of data access.

I don't have a clear proposal for how to clean this up, but this seems like a potential pain point, or at the very least results in people leaving performance on the table both when training and at inference.

TL;DR: Many data scientists just want to write CNNs without thinking about tensor layouts.

@prb12 prb12 added the type:feature label Mar 9, 2017
@prb12
Member

prb12 commented Mar 9, 2017

I agree.

It is actually the case that internally XLA is free to permute the physical layout of tensor dimensions to improve speed. However, when tensors flow into and out of XLA from regular TensorFlow ops, the data needs to be in row-major layout.

The visible data_format fields of Conv ops, etc. are an unfortunate artifact of the NVIDIA cuDNN library interface and the way it is supported in TensorFlow. Theoretically it ought to be possible to write a TensorFlow GraphOptimizationPass to analyze the user model (written with canonical layouts) and make the appropriate transformations.
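To illustrate the kind of pass being described, here is a toy, framework-free sketch: the "graph" is just a list of (op name, preferred layout) pairs, and the pass inserts a transpose op only where the layout actually changes, so back-to-back NCHW convolutions share one conversion. All names here are hypothetical; a real TensorFlow GraphOptimizationPass rewrites the GraphDef in C++.

```python
# Toy layout-assignment pass. Not TensorFlow code -- just the shape of the
# idea: walk the ops in order, track the current layout, and bracket runs of
# ops that prefer a different layout with transpose ops.

def insert_layout_transposes(ops, input_layout="NHWC"):
    rewritten, current = [], input_layout
    for name, preferred in ops:
        if preferred != current:
            rewritten.append(("transpose_%s_to_%s" % (current, preferred),))
            current = preferred
        rewritten.append((name,))
    return rewritten

graph = [("conv1", "NCHW"), ("conv2", "NCHW"), ("softmax", "NHWC")]
rewritten = insert_layout_transposes(graph)
# Only two transposes are inserted: one before conv1, one before softmax;
# conv1 and conv2 share the NCHW region.
assert len(rewritten) == 5
```

The point of doing this as a pass rather than in user code is exactly the one raised above: the model can be written once in a canonical layout, and the rewrite happens per platform.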

@prb12 prb12 added the stat:contribution welcome label Mar 9, 2017
@persiyanov
Contributor

persiyanov commented Mar 9, 2017

@cancan101 @prb12 Hi! Is someone already working on this? If not, I would like to take part in implementing this feature (yeah, I know this is a serious issue, not like my first one #7948). Could you point me in the right direction?

As I understand it, the point is this:
• We know that NCHW is better for training on GPU, but NHWC is better for inference on CPU.
• But we don't want to force the user to think about these details.
• The user wants to build one model with one data_format and use it for both inference and training.
• What we need to do is still run cuDNN ops with the NCHW format and CPU ops with NHWC, so we need to insert the appropriate transposes.

@prb12 Could you please elaborate more on your idea with GraphOptimizationPass?

@taion
Contributor

taion commented Mar 9, 2017

In an ideal world, I think we'd prefer not to specify data_format for the model at all. I'd like to just tag my input batch with some data format and have the model internally run transpose operations as needed to use the fastest data format for the given platform... but this is going to be a tradeoff between the cost of transposing and the benefit of using a more optimal data format, if certain ops only support certain data formats.
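That tradeoff can be stated as a back-of-the-envelope cost model: converting a region of the graph is only worth it if the per-op savings summed over the region exceed the cost of the two transposes bracketing it. A sketch with made-up numbers, purely for illustration:

```python
# Hypothetical cost model for the transpose-vs-layout tradeoff described
# above. All costs are in arbitrary units and invented for this example.

def worth_converting(num_ops, saving_per_op, transpose_cost):
    """True if converting a run of ops to the faster layout pays for the
    two transposes needed to enter and leave that layout."""
    return num_ops * saving_per_op > 2 * transpose_cost

# A long run of convs easily amortizes the conversion...
assert worth_converting(num_ops=50, saving_per_op=0.2, transpose_cost=1.0)
# ...a single op usually does not.
assert not worth_converting(num_ops=1, saving_per_op=0.2, transpose_cost=1.0)
```

This is why a whole-graph pass that groups ops into layout regions (rather than a per-op decision) tends to be the right granularity.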

@persiyanov
Contributor

persiyanov commented Mar 9, 2017

I also thought about that, but I think that's a lesser issue than having to build the model twice, for training and inference. In most cases the data is in one of these two formats, or can be transposed into the proper format on the fly outside TensorFlow.

P.S. Maybe I am wrong and more experienced computer vision researchers/developers have a different opinion.

@taion
Contributor

taion commented Mar 9, 2017

Inference-time models need to be different from training-time models regardless, no? You want to build them with training or is_training set to False to drop unnecessary stuff for batch norm, dropout, &c.

I guess these can be handled in optimization passes, but at least that's not how we're handling it right now. We rebuild our graphs for inference with training=False before passing them through optimization.

@tumusudheer

Hi @cancan101 ,

I'm facing a similar problem to the one you mentioned in your initial post in this thread. I have a checkpoint trained using the NCHW format, but I want to convert it into a checkpoint whose weights work for NHWC-format input images. May I know how to do it? You mentioned:
'This requires the user to do some "wrangling" (e.g. loading the checkpoint of weights and re-building the graph in Python) to map from one image format to another.'

I know how to load a checkpoint file, but how do I rebuild the graph for a different image input format and save a new checkpoint for the new graph? It would be great if you could provide some help or any sample/pseudo code.

Thanks in advance!
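Not an official recipe, but a sketch of the one place weights typically need rewriting when going NCHW → NHWC: TensorFlow conv kernels are stored HWIO under both layouts, so they usually carry over unchanged, while a Dense kernel that follows a flatten sees its input features in a different order and its rows must be permuted. Pure NumPy below; `remap_dense_kernel` is a hypothetical helper, not a TF API.

```python
import numpy as np

# Sketch: reorder the rows of a dense kernel so that a model rebuilt with
# NHWC inputs produces the same outputs as the original NCHW model. The
# dense rows were indexed by a C-order flatten of (C, H, W); after the
# switch they are indexed by a flatten of (H, W, C).

def remap_dense_kernel(kernel, c, h, w):
    """Permute dense-kernel rows from a (C, H, W) flatten to (H, W, C)."""
    units = kernel.shape[1]
    k = kernel.reshape(c, h, w, units)            # recover the CHW row structure
    return np.transpose(k, (1, 2, 0, 3)).reshape(c * h * w, units)

# Check: a feature map flattened either way hits the same weights.
c, h, w, units = 3, 4, 4, 5
feat = np.random.randn(c, h, w)                   # one CHW feature map
kernel = np.random.randn(c * h * w, units)        # original NCHW-model dense kernel
out_nchw = feat.reshape(-1) @ kernel
out_nhwc = np.transpose(feat, (1, 2, 0)).reshape(-1) @ remap_dense_kernel(kernel, c, h, w)
assert np.allclose(out_nchw, out_nhwc)
```

Everything else (loading the checkpoint, rebuilding the graph with the other data_format, saving a new checkpoint) is bookkeeping around this permutation; the exact API calls depend on your TF version.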


@github-actions

This issue is stale because it has been open for 180 days with no activity. It will be closed if no further activity occurs. Thank you.

@github-actions github-actions bot added the stale label Apr 20, 2023
@mohini83

is this issue still open?

@google-ml-butler google-ml-butler bot removed the stale label Oct 10, 2023

github-actions bot commented Apr 8, 2024

This issue is stale because it has been open for 180 days with no activity. It will be closed if no further activity occurs. Thank you.

@github-actions github-actions bot added the stale label Apr 8, 2024
@tilakrayal
Contributor

Hi,

Thank you for opening this issue. Since this issue has been open for a long time, the code/debug information for this issue may no longer be relevant to the current state of the code base.

The TensorFlow team is constantly improving the framework by fixing bugs and adding new features. We suggest you try the latest TensorFlow version with the latest compatible hardware configuration, which could potentially resolve the issue. If you are still facing the issue, please create a new GitHub issue with your latest findings, with all the debugging information which could help us investigate.

Please follow the release notes to stay up to date with the latest developments happening in the TensorFlow space.

@google-ml-butler google-ml-butler bot removed the stale label May 27, 2024
@tilakrayal tilakrayal added the stat:awaiting response label May 27, 2024

github-actions bot commented Jun 4, 2024

This issue is stale because it has been open for 7 days with no activity. It will be closed if no further activity occurs. Thank you.

@github-actions github-actions bot added the stale label Jun 4, 2024

This issue was closed because it has been inactive for 7 days since being marked as stale. Please reopen if you'd like to work on this further.
