Feature Request: as.ISO8601 #629

Closed
billdenney opened this issue Jan 29, 2018 · 7 comments

Comments

@billdenney
Contributor

as.ISO8601 methods would be useful for all the date, date/time, period, duration, and interval functions.

To make this as useful as possible, I think it would be better not to use as.character and instead to create a new as.ISO8601 generic function with these arguments:

  • x: The object to convert
  • format: This would depend on what is being represented:
    • With dates, it could be "ymd" (year month day), "ywd" (year week day), "yd" (year day in YYYY-DDD format), "y" (year only), "m" (month only), "md" (month day only)
    • With times, it could be "hms", "hm", or "h"
    • With date/times, it could be any pair of the options for dates and times separated by a "T" (default "ymdThms"). I chose "T" rather than something else to align with the expected output as defined in the ISO standard.
    • Periods would share the format specifier with date, times, or date/times.
  • include_tz: if NULL (default), include/exclude the time zone from the date/time based on the input data. If FALSE, exclude the time zone from the representation, and if TRUE include the time zone in the representation. (It would be ignored if time zone does not apply to the output format.)
  • repeat: (default: NULL) For intervals, repeating could be specified by setting to a non-NULL, positive integer or Inf for indefinite, future repeats.
  • repeat_start: (default: NULL) For intervals, repeat start date/time. (Must be a date, time, or date/time object type.)

(As initially discussed in #362)

@vspinu
Member

vspinu commented Jan 29, 2018

We should name it something else, I guess. There is no class ISO8601, so calling these "as" methods would be a misnomer. I think the format argument is not really needed, because this is a conversion from lubridate objects to characters, which is generally unambiguous. I'm not sure about repeat either: lubridate's intervals don't support repetitions.

@billdenney
Contributor Author

For a name, as.character_ISO8601? Or, we could put it into the as.character methods with an argument of format="ISO8601"?

On the format argument: while I personally wouldn't use them, the standard defines several representations for dates: "YYYY-MM-DD", "YYYY-WWW-D", and "YYYY-DDD". Those seem like real formats. For the other date formats, does lubridate have a way to represent a missing year with a specified month and day? (I've not seen one, but maybe I'm missing something.) If not, it can still be helpful to have that as an output format. If it is possible, then I agree that the simpler 1:1 mapping from object to character makes sense; could you please point me to the way to represent it?

For repeat: no, lubridate's intervals don't support repetition, but that would be information added at the time of character conversion. Repeats are also pretty simple to add manually, so I don't feel strongly about it.

@vspinu
Member

vspinu commented Jan 30, 2018

> we could put it into the as.character methods with an argument of format="ISO8601"?

This can work but the drawback is that you cannot add methods to classes which you don't own (like difftime). Maybe format_iso or format_ISO8601, or just ISO8601?

> For the other date formats, does lubridate have a way to represent a missing year with a specified month and day

Nope. There are two core date-time classes - Date and POSIX.

The final implementation should consider the parser - it should allow for the round trip. Currently not everything in ISO8601 is supported and implementing all those ISO tweaks and tricks is, quite frankly, out of scope.

> there are different representations of the standard for dates, "YYYY-MM-DD" and "YYYY-WWW-D" and "YYYY-DDD".

With small effort anyone can write their own formatter with format directly. So I would suggest that we first aim at standard 1:1 representations and see if we really need anything else once that's done.
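
As a rough illustration of the point about hand-written formatters (not part of the original comment), the three date representations under discussion can be produced with base R `format()` alone. This sketch assumes an R build whose `strftime` supports the ISO week specifiers `%G`, `%V`, and `%u` on output; the variable names are only for illustration.

```r
d <- as.Date("2018-01-29")  # the day this issue was opened, a Monday

ymd <- format(d, "%Y-%m-%d")   # calendar date: YYYY-MM-DD
ywd <- format(d, "%G-W%V-%u")  # week date: ISO week-based year, week number, weekday
yd  <- format(d, "%Y-%j")      # ordinal date: year and day of year

print(c(ymd = ymd, ywd = ywd, yd = yd))
# ymd is "2018-01-29", ywd is "2018-W05-1", yd is "2018-029"
```

Note that the week date uses `%G` rather than `%Y`, since the ISO week-based year can differ from the calendar year near January 1.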

@billdenney
Contributor Author

I like format_ISO8601. It's clear in both intent (format) and target (ISO8601).

> The final implementation should consider the parser - it should allow for the round trip. [...]

Perfectly fair. If I find myself needing something more esoteric, I may write an ISO8601 library. For now, I'm happier with it living in lubridate.

I'll just keep the x and include_tz arguments described in the original comment for this issue. I'll add a ... argument so that someone (maybe me) could later extend the generic in another package.
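
A minimal sketch of what a generic with this shape might look like, using only base R. The name `format_ISO8601_sketch` is hypothetical and this is not lubridate's implementation. In particular, the reading of `include_tz = NULL` as "include the offset only when the object has an explicit `tzone` attribute" is an assumption made for illustration.

```r
# Hypothetical S3 generic; `...` leaves room for methods in other packages.
format_ISO8601_sketch <- function(x, include_tz = NULL, ...) {
  UseMethod("format_ISO8601_sketch")
}

# Dates carry no time zone, so include_tz is ignored here.
format_ISO8601_sketch.Date <- function(x, include_tz = NULL, ...) {
  format(x, "%Y-%m-%d")
}

format_ISO8601_sketch.POSIXct <- function(x, include_tz = NULL, ...) {
  if (is.null(include_tz)) {
    # Assumption: "based on the input data" means a non-empty tzone attribute.
    tzone <- attr(x, "tzone")
    include_tz <- !is.null(tzone) && nzchar(tzone[1])
  }
  fmt <- if (isTRUE(include_tz)) "%Y-%m-%dT%H:%M:%S%z" else "%Y-%m-%dT%H:%M:%S"
  format(x, fmt)
}

x <- as.POSIXct("2018-01-29 12:34:56", tz = "UTC")
print(format_ISO8601_sketch(as.Date(x)))
print(format_ISO8601_sketch(x, include_tz = FALSE))
print(format_ISO8601_sketch(x, include_tz = TRUE))
```

Because dispatch goes through `UseMethod()`, another package could add, say, a method for its own class without touching the generic.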

@vspinu
Member

vspinu commented Jan 30, 2018

Maybe with_tz or add_tz to make the argument a bit shorter. If it's UTC, then Z should always be added, I think.

@billdenney
Contributor Author

with_tz sounds like a good name to me. I hesitate to always add the time zone if UTC or if it is the local time zone since those tend to be added accidentally or by default to a lot of datasets (at least ones that I receive).

@billdenney
Contributor Author

I wrote this feature tonight. A few differences relative to our conversation above:

  1. Instead of with_tz, I named the argument usetz to align with the base R as.character method argument.
  2. I didn't do the additional formatting to emit Z instead of +0000 for UTC because that would have added a notable amount of extra code. I can do it, but the longer-term maintenance cost didn't seem worthwhile. If you'd like Z, I can make that update.
  3. I added a precision argument to allow the user to request the output precision (I often need "ymdhm" without seconds). It would be a simple modification to make that formatting match the function naming (e.g. ymd_hm). Let me know if you have a preference there.
  4. The odd combination of precision="y", usetz=TRUE gives unusual results (https://github.com/billdenney/lubridate/blob/a3b13ec6813397abae6fff3792062e2be17299fc/tests/testthat/test-format_ISO8601.R#L36-L40). I left it as is because it's what the user is requesting, even if it doesn't make much sense.
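
For illustration, a short usage sketch of the arguments described in this list. It assumes lubridate 1.7.8 or later is installed (the release whose changelog below lists `format_ISO8601()`) and that the released function kept the `usetz` and `precision` arguments as described here. The final `sub()` call is not something the function does; it only shows one possible way to get the Z suffix discussed in item 2.

```r
library(lubridate)  # assumes lubridate >= 1.7.8

x <- as.POSIXct("2018-02-01 03:04:05", tz = "America/New_York")

full    <- format_ISO8601(x)                       # date-time without offset
with_tz <- format_ISO8601(x, usetz = TRUE)         # numeric offset appended
minutes <- format_ISO8601(x, precision = "ymdhm")  # seconds dropped

# Post-processing sketch: rewrite a zero offset as "Z".
utc   <- lubridate::with_tz(x, "UTC")
zulu  <- sub("[+-]0000$", "Z", format_ISO8601(utc, usetz = TRUE))

print(c(full, with_tz, minutes, zulu))
```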

netbsd-srcmastr pushed a commit to NetBSD/pkgsrc that referenced this issue May 30, 2021
Version 1.7.10
==============

### NEW FEATURES

* `fast_strptime()` and `parse_date_time2()` now accept multiple formats and apply them in turn

### BUG FIXES

* [#926](tidyverse/lubridate#926) Fix incorrect division of intervals by months involving leap years
* Fix incorrect skipping of digits during parsing of the `%z` format

Version 1.7.9.2
===============

### NEW FEATURES

* [#914](tidyverse/lubridate#914) New `rollforward()` function
* [#928](tidyverse/lubridate#928) On startup lubridate now resets TZDIR to a proper directory when it is set to non-dir values like "internal" or "macOS" (a change introduced in R 4.0.2)
* [#630](tidyverse/lubridate#630) New parsing functions `ym()` and `my()`

### BUG FIXES

* [#930](tidyverse/lubridate#930) `as.period()` on intervals now returns valid Periods with double fields (not integers)



Version 1.7.9
=============

### NEW FEATURES

* [#871](tidyverse/lubridate#893) Add `vctrs` support


### BUG FIXES

* [#890](tidyverse/lubridate#890) Correctly compute year in `quarter(..., with_year = TRUE)`
* [#893](tidyverse/lubridate#893) Fix incorrect parsing of abbreviated months in locales with trailing dot (regression in v1.7.8)
* [#886](tidyverse/lubridate#886) Fix `with_tz()` for POSIXlt objects
* [#887](tidyverse/lubridate#887) Error on invalid numeric input to `month()`
* [#889](tidyverse/lubridate#889) Export new dmonth function

Version 1.7.8
=============

### NEW FEATURES

* (breaking) Year and month durations now assume 365.25 days in a year consistently in conversion and constructors. Particularly `dyears(1) == years(1)` is now `TRUE`.
* Format and print methods for 0-length objects are more consistent.
* New duration constructor `dmonths()` to complement other duration constructors.
* `duration()` constructor now accepts `months` and `years` arguments.
* [#629](tidyverse/lubridate#629) Added `format_ISO8601()` methods.
* [#672](tidyverse/lubridate#672) Eliminate all partial argument matches
* [#674](tidyverse/lubridate#674) `as_date()` now ignores the `tz` argument
* [#675](tidyverse/lubridate#675) `force_tz()`, `with_tz()`, `tz<-` convert dates to date-times
* [#681](tidyverse/lubridate#681) New constants `NA_Date_` and `NA_POSIXct_` which parallel built-in primitive constants.
* [#681](tidyverse/lubridate#681) New constructors `Date()` and `POSIXct()` which parallel built-in primitive constructors.
* [#695](tidyverse/lubridate#695) Durations can now be compared with numeric vectors.
* [#707](tidyverse/lubridate#707) Constructors return 0-length inputs when called with no arguments
* [#713](tidyverse/lubridate#713) (breaking) `as_datetime()` always returns a `POSIXct()`
* [#717](tidyverse/lubridate#717) Common generics are now defined in `generics` dependency package.
* [#719](tidyverse/lubridate#719) Negative Durations are now displayed with leading `-`.
* [#829](tidyverse/lubridate#829) `%within%` throws more meaningful messages when applied on unsupported classes
* [#831](tidyverse/lubridate#831) Changing hour, minute or second of Date object now yields POSIXct.
* [#869](tidyverse/lubridate#869) Propagate NAs to all internal components of a Period object

### BUG FIXES

* [#682](tidyverse/lubridate#682) Fix quarter extraction with small `fiscal_start`s.
* [#703](tidyverse/lubridate#703) `leap_year()` works with objects supported by `year()`.
* [#778](tidyverse/lubridate#778) `duration()/period()/make_difftime()` work with repeated units
* `c.Period` concatenation doesn't fail with empty components.
* Honor `exact = TRUE` argument in `parse_date_time2`, which was so far ignored.

Version 1.7.4
=============

### NEW FEATURES

* [#658](tidyverse/lubridate#658) `%within%` now accepts a list of intervals, in which case an instant is checked if it occurs within any of the supplied intervals.

### CHANGES

* [#661](tidyverse/lubridate#661) Throw error on invalid multi-unit rounding.
* [#633](tidyverse/lubridate#633) `%%` on intervals relies on `%m+%` arithmetic and doesn't produce NAs when intermediate computations result in non-existent dates.
* `tz()` always returns "UTC" when `tzone` attribute cannot be inferred.

### BUG FIXES

* [#664](tidyverse/lubridate#664) Fix lookup of period functions in `as.period`
* [#649](tidyverse/lubridate#664) Fix system timezone memoization

Version 1.7.3
=============

### BUG FIXES

* [#643](tidyverse/lubridate#643), [#640](tidyverse/lubridate#640), [#645](tidyverse/lubridate#645) Fix faulty caching of system timezone.

2 participants