-
Notifications
You must be signed in to change notification settings - Fork 57
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use new non-standard FITS extension for ASDF-in-FITS #412
Conversation
Hi there @drdavella 👋 - thanks for the pull request! I'm just a friendly 🤖 that checks for issues related to the changelog. I help make sure that this pull request is milestoned and labeled correctly. This is mainly intended for the maintainers, so if you are not a maintainer you can ignore this, and a maintainer will let you know if any action is required on your part 😃. Everything looks good from my point of view! 👍 If there are any issues with this message, please report them here. |
a4c77e8
to
9d996c5
Compare
asdf/fits_embed.py
Outdated
|
||
try: | ||
asdf_extension = self._hdulist[ASDF_EXTENSION_NAME] | ||
except (KeyError, IndexError, AttributeError): | ||
self._hdulist.append(fits.ImageHDU(array, name=ASDF_EXTENSION_NAME)) | ||
self._hdulist.append(self._create_hdu(buff, use_image_hdu)) | ||
else: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
array
is not defined here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch. This was an oversight. I'm not sure it will be quite as simple as replacing the deleted line, but I'll add a unit test to exercise this case and go from there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As James said, I don't see why you deleted line 238 in fits_embed.py. It's used at line 245.
# Allow writing to old-style ImageHDU for backwards compatibility | ||
if use_image_hdu: | ||
array = np.frombuffer(buff.getvalue(), np.uint8) | ||
return fits.ImageHDU(array, name=ASDF_EXTENSION_NAME) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the use of the ASDF_EXTENSION_NAME value here going to cause the XTENSION keyword to still have a value of "ASDF", even though in this case we want it to be "IMAGE"? Or does "name" in this case set the value of the EXTNAME keyword?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's the latter: the name
keyword argument will make EXTNAME
be ASDF
, but the XTENSION
will still be IMAGE
since we're using ImageHDU
.
However, this is the backwards compatible case. In the new case, both XTENSION
and EXTNAME
will be ASDF
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Howard, here's a comparison of a current DMS NIRISS rate file read in and written out under the new asdf code. Compare the last HDU of the DMS version and the version with the new code:
No. Name Ver Type Cards Dimensions Format
0 PRIMARY 1 PrimaryHDU 244 ()
1 SCI 1 ImageHDU 48 (2048, 2048) float32
2 ERR 1 ImageHDU 10 (2048, 2048) float32
3 DQ 1 ImageHDU 11 (2048, 2048) int32 (rescales to uint32)
4 VAR_POISSON 1 ImageHDU 9 (2048, 2048) float32
5 VAR_RNOISE 1 ImageHDU 9 (2048, 2048) float32
6 ASDF 1 ImageHDU 7 (5998,) uint8
---------------------------------------------------------------
No. Name Ver Type Cards Dimensions Format
0 PRIMARY 1 PrimaryHDU 244 ()
1 SCI 1 ImageHDU 48 (2048, 2048) float32
2 ERR 1 ImageHDU 10 (2048, 2048) float32
3 DQ 1 ImageHDU 11 (2048, 2048) int32 (rescales to uint32)
4 VAR_POISSON 1 ImageHDU 9 (2048, 2048) float32
5 VAR_RNOISE 1 ImageHDU 9 (2048, 2048) float32
6 ASDF 1 NonstandardExtHDU 8 ()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume the lack of Dimensions and Format info for the new ASDF extension is due to the fact that fits.info no longer knows how to handle this type of extension?
And it looks like there's really only one change to a test in |
I just want to mention that this PR also explicitly "un-implements" the |
5e54fa3
to
a29dbd0
Compare
@drdavella - I was trying to test this out and I ran into a couple of oddities while doing so. It's not clear that it's related to these changes, but it's preventing me from testing this. Specifically, I follow the example from http://asdf.readthedocs.io/en/latest/asdf/examples.html#saving-asdf-in-fits (which looks to be unchanged in this PR) and I get a (What I was actually trying to do was look at the generated fits files without asdf installed to see how things behave. But can't do that without an example.) |
# Allow writing to old-style ImageHDU for backwards compatibility | ||
if use_image_hdu: | ||
array = np.frombuffer(buff.getvalue(), np.uint8) | ||
return fits.ImageHDU(array, name=ASDF_EXTENSION_NAME) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume the lack of Dimensions and Format info for the new ASDF extension is due to the fact that fits.info no longer knows how to handle this type of extension?
@eteq ASDF can't be used from the source directory. Try installing the package and doing the same thing somewhere else. I can try to add a better error message for this for the user but I don't think we're going to support this for a variety of reasons that can be discussed elsewhere. |
@hbushouse I just pushed an update that will allow [edit] Actually it will only be automatically recognized as an cc @jdavies-st |
FWIW, it turns out the problem I had above was that I hadn't updated Now that I got it working, I have an observation of the behavior that I think might be a bit surprising. Supposing you have a file >>> from astropy.io import fits
>>> fits.open('embedded_asdf.fits')[-1]
<astropy.io.fits.hdu.base.NonstandardExtHDU at 0x10cd6df98>
>>> import asdf
>>> fits.open('embedded_asdf.fits')[-1]
<astropy.io.fits.hdu.base.NonstandardExtHDU at 0x111788780>
>>> from asdf import fits_embed
>>> fits.open('embedded_asdf.fits')[-1]
<asdf.fits_embed._AsdfHDU at 0x111793860> So the class that comes out depends in whether |
@eteq very good point, will push an update shortly. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aside from my comment above, one more thought from an inline review. The rest looks good though!
('PCOUNT', 0, 'number of parameters'), | ||
('GCOUNT', 1, 'number of groups'), | ||
('COMPRESS', compress, 'Uses gzip compression'), | ||
('EXTNAME', cls._extension, 'Name of ASDF extension'), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I read this right, the name is still hard-coded as ASDF
. Is that desirable? I would have thought this should allow multiple ASDF-type extensions, with arbitrary names. Or is this necessary for backwards compatibility?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Admittedly this is not the most generic design. However, since the AsdfInFits
class is expected to be the only user of this HDU class, and since astropy.io.fits
locates HDUs in an HDUList
based on EXTNAME
, it just made sense to set EXTNAME
to ASDF
here. In other words, this is a matter of convenience and an implementation detail of AsdfInFits
, but not necessarily part of the abstract ASDF HDU convention itself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Multiple ASDF extensions could be distinguished via unique EXTVER values, assuming that keyword makes sense in an ASDF extension?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be possible, but I would need to investigate more. Currently AsdfInFits
expects only a single ASDF extension to be present in a FITS file. Until the JWST pipeline has a use case for multiple ASDF extensions in the same file, I don't think it makes sense to support this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, well, I'm not necessarily attached to this specific line as the spot to change it. Are you thinking that if I am a user and want to have an ASDF-in-FITS HDU with the name "WCS_MODEL", I'd be expected to do something like
fits_embed.AsdfInFits(hdulist, tree)
hdulist['ASDF'].header['EXTNAME] = 'WCS_MODEL'
? I'm really just saying I think it would be useful to have the option to do something like fits_embed.AsdfInFits(hdulist, tree, name='WCS_MODEL')
, as the typical user is likely to see that as the main place where they set information about the hdu.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I think I just realized my mis-conception. I thought it was possible to have multiple asdf trees in one file (I see @drdavella pointed this out above, but I didn't see it until now...). I see now that the "reader" mode only supports a single one. I guess that makes this use case less important. Probably I should make a separate issue about that, but this concern about the naming could wait until then...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to be clear: the ASDF HDU is used solely for storing a serialized version of the ASDF metadata tree. AsdfInFits
currently expects a one-to-one mapping between FITS files and ASDF trees. The actual data arrays, including any that are used for WCS models or any other purposes, are actually stored in conventional FITS HDUs. The details of this, including the assignment of any EXTNAME
are controlled completely by the user. This array data is accessible by any FITS reader.
So it doesn't really make sense to use anything other than ASDF
as the EXTNAME
for this HDU because the only thing that will ever actually process it is our AsdfInFits
code. Its just an implementation detail and anyone who reads the pure FITS file does not need to be concerned with it. Unless there becomes a compelling reason to have more than one ASDF metadata tree in the same FITS file, I don't think it makes sense to change anything.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gotcha, that clears things up. I withdraw my request 😉
@hbushouse, @jdavies-st, @bernie-simon, @nden I'll wait for your call on when this is okay to merge. Based on discussions with @jdavies-st it doesn't sound like this change is necessarily going to be catastrophic for the JWST regression suite, but I want to make sure everyone is ready for it when it gets merged. |
According to @nden there is a DMS delivery this week, so we will wait to merge until next week. |
@nden just wondering if merging this would be a good way to ring in the new year? |
@drdavella I would say go for it. Everything is pretty stable right now and has been since we did the tag/branch for DMS just before Christmas. No other major changes have been merged lately. |
The motivation for this update was a problem report from the archive. They pointed out that storing ASDF metadata in an
IMAGE
HDU goes against convention and would likely lead to incorrect handling of the data when the resulting file is processed by FITS readers.It was originally proposed to instead use the
FOREIGN
extension. However, this convention has already been defined, and it seemed too restrictive to have to conform to this convention, especially since there is not already an implementation inastropy
.Instead, it was agreed that we would create a new
ASDF
extension for storing ASDF metadata in FITS files. This PR contains the implementation of the corresponding HDU, which is nearly identical to theFitsHDU
inastropy.io.fits
. In this case, theXTENSION
keyword isASDF
. The ASDF library writes and expects to read extensions with anEXTNAME
ofASDF
.The
AsdfInFits
implementation has been updated to write new files using the new convention by default. However, it is still capable of reading older files using theIMAGE
extension, and it is still possible to optionally write files using theIMAGE
extension as well.I expect that these changes will break a lot of tests in the JWST pipeline, so I will make sure that everyone is on board before hitting the merge button.
This closes #336.