Add a Rational Number type #164

robUx4 · 2017-08-09T09:43:43Z

Since some Matroska elements would be better as rational numbers we need a way to store them. I think a type would be better than using two EBML Signed Integers. The bits in the VDATA would be divided in half. The first part (signed) for the numerator, the second part (signed?) for the denominator.
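A minimal sketch of the "divided in half" idea, for illustration only. The function name `decode_halved_rational` is hypothetical, and it assumes big-endian byte order and that both halves are signed (the denominator's signedness is still an open question above).

```python
def decode_halved_rational(data: bytes) -> tuple[int, int]:
    """Split the element data in half: numerator first, denominator second."""
    if len(data) == 0 or len(data) % 2 != 0:
        raise ValueError("data length must be a positive even number of bytes")
    half = len(data) // 2
    # Assumption: both halves are big-endian two's complement integers.
    numerator = int.from_bytes(data[:half], "big", signed=True)
    denominator = int.from_bytes(data[half:], "big", signed=True)
    return numerator, denominator


print(decode_halved_rational(bytes([0x75, 0x30, 0x03, 0xE9])))  # (30000, 1001)
```

One consequence of this layout is that both parts always take the same number of bytes, even when one of them is small.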

retokromer · 2017-08-09T12:09:56Z

Caveat: I didn’t check where this would be useful.

Generally speaking, in my personal experience, it is easier to implement a rational number type as two regular integers.

dericed · 2017-08-09T18:30:36Z

How would a parser know where the boundary between the two signed integers is? And what is the advantage of storing num and den as a single value?

robUx4 · 2017-08-19T07:59:35Z

It might use less space, although that depends on the length of the ID and of the value itself.
When reading the value, the parser doesn't have to keep the first value in a state before using the next value.
If a rational element can be found more than once (a list of values), it's not possible to do it with 2 elements, or you need to wrap them in a master element.

robUx4 · 2020-10-11T07:58:05Z

You can't have one value and not the other. You have both or none.

hubblec4 · 2021-03-27T17:11:17Z

Is the rational type only for unsigned numbers or do we need a signed rational type too?

robUx4 · 2021-03-28T06:56:19Z

The first part (signed) for the numerator, the second part (signed?) for the denominator.

Signed (which includes unsigned apart from very large values)

mbunkus · 2021-03-28T13:43:52Z

If we really need such a field[1], I propose the following:

Starts with the regular EBML ID & element size fields; let's call this size field element_data_size (in bytes)
Following is a second size field encoded the same way the element size field is encoded; let's call this field numerator_size (in bytes) and the size of this field itself numerator_size_size (in bytes)
Following is an unsigned integer number of length numerator_size bytes; this is the numerator
Following is a signed integer number of length element_data_size - numerator_size_size - numerator_size; this is the denominator

Restrictions:

the value of the denominator MUST NOT be 0
the value of numerator_size_size + numerator_size MUST be less than element_data_size
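The layout and restrictions above could be parsed roughly as follows. This is only a sketch: the helper names `read_vint` and `decode_rational_vint_size` are hypothetical, big-endian byte order is assumed (the proposal does not state it), and the input is the element data only, after the EBML ID and element size have already been read.

```python
def read_vint(data: bytes, pos: int = 0) -> tuple[int, int]:
    """Read an EBML variable-size integer; return (value, length in bytes)."""
    first = data[pos]
    if first == 0:
        raise ValueError("VINTs longer than 8 bytes are not handled in this sketch")
    length = 9 - first.bit_length()  # leading zero bits + 1
    if pos + length > len(data):
        raise ValueError("truncated VINT")
    raw = int.from_bytes(data[pos:pos + length], "big")
    return raw & ((1 << (7 * length)) - 1), length  # clear the marker bit


def decode_rational_vint_size(payload: bytes) -> tuple[int, int]:
    """Decode [numerator_size as VINT][numerator][denominator]."""
    element_data_size = len(payload)
    numerator_size, numerator_size_size = read_vint(payload)
    header = numerator_size_size + numerator_size
    if header >= element_data_size:
        raise ValueError(
            "numerator_size_size + numerator_size must be less than element_data_size"
        )
    # Unsigned numerator, then a signed denominator filling the rest.
    numerator = int.from_bytes(payload[numerator_size_size:header], "big")
    denominator = int.from_bytes(payload[header:], "big", signed=True)
    if denominator == 0:
        raise ValueError("the denominator must not be 0")
    return numerator, denominator


# 0x82 = VINT for numerator_size 2, then 30000, then 1001.
print(decode_rational_vint_size(bytes([0x82, 0x75, 0x30, 0x03, 0xE9])))  # (30000, 1001)
```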

Rationale:

Why make the denominator signed, not the numerator? Wild guess here: the denominator will stay constant (or almost constant) for most well-known applications such as video timestamps (e.g. the denominator staying at exactly 30 * 1,000,000,000/1001 or something like that to represent 29.97 frames per second). The numerator usually grows, so let's give it that one bit more to grow into.
If we need a rational type, using a packed structure does indeed save a lot of space. The only other possible method of implementation would be a master element with two integer children, one signed, one unsigned. That's quite a lot of useless overhead: two more IDs (at least two bytes), one useless size field (the one of the master element). And it introduces a number of corner cases such as what if an element is missing? How to specify default values for nested elements? Etc.
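To put rough numbers on the overhead argument, here is a small byte count for both layouts. The function names are hypothetical, and the figures assume one-byte size fields throughout and child element IDs of `child_id_len` bytes; real IDs and sizes may be longer.

```python
def packed_size(id_len: int, num_len: int, den_len: int) -> int:
    """ID + element size (1) + numerator_size (1) + numerator + denominator."""
    return id_len + 1 + 1 + num_len + den_len


def master_size(id_len: int, child_id_len: int, num_len: int, den_len: int) -> int:
    """Master ID + master size (1) + two children, each with ID + size (1) + value."""
    children = (child_id_len + 1 + num_len) + (child_id_len + 1 + den_len)
    return id_len + 1 + children


# One-byte IDs, two-byte numerator and denominator.
print(packed_size(1, 2, 2))     # 7
print(master_size(1, 1, 2, 2))  # 10
```

Under these assumptions the master element costs three extra bytes per value: the two child IDs plus one additional size field.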

[1] Not sure if we really need this data type given the sample-accurate calculation method you specified the other day.

robUx4 · 2021-04-04T08:10:36Z

I'd rather have [num_size][num_value][den_size][den_value]. In the case the rational value is equal to an integer we just use [num_size][num_value] and assume the denominator is 1.

We do need a way to specify the default attribute values, like we have a format for floats. I suppose a string with "numerator/denominator" would work. Maybe make the denominator optional if we allow it, in which case such a default value would be "numerator".

mbunkus · 2021-04-04T08:58:56Z

We always have the size field for the full element. There's really no need for three size fields as we can always calculate the third size from the other two. It would be a waste of space, a significant one if we used a rational element in something like a hypothetical BlockV3 structure.

I agree about default values. Leaving out /denominator simply means that denominator is 1.
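A default value string in that form could be parsed along these lines. This is a sketch only; the function name `parse_rational_default` is hypothetical and the exact grammar (whitespace, signs) has not been decided in this thread.

```python
def parse_rational_default(text: str) -> tuple[int, int]:
    """Parse "numerator/denominator" or just "numerator" (denominator 1)."""
    numerator_text, separator, denominator_text = text.partition("/")
    numerator = int(numerator_text)
    # Leaving out "/denominator" means the denominator is 1.
    denominator = int(denominator_text) if separator else 1
    if denominator == 0:
        raise ValueError("the denominator must not be 0")
    return numerator, denominator


print(parse_rational_default("30000/1001"))  # (30000, 1001)
print(parse_rational_default("25"))          # (25, 1)
```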

robUx4 · 2021-04-04T09:13:03Z

OK my bad, I thought what you proposed was [num_size][den_size][num_value][den_value].

I agree with [num_size][num_value][den_value], the [den_value] part being optional.

mbunkus · 2021-04-04T09:26:16Z

Great!

hubblec4 · 2021-04-04T14:47:51Z

I also agree with what @mbunkus proposed.
But there is one thing I would change: the size field for the numerator [num_size].
This size field could be a "normal" integer rather than a VINT, which makes parsing faster.

Reason:
The size of the numerator (and also the size of the denominator) is limited to 8, which means the length of the size field is always 1.

[EBML-ID] [element size-field as VINT] [numerator: "normal" size-field as unsigned integer] [numerator value] [denominator value]

robUx4 · 2021-04-05T06:02:40Z

Yes, given the limitation of the width of the numerator a regular unsigned 1-octet integer is easier for everyone.

robUx4 · 2023-03-25T16:15:25Z

BTW we will probably need signed rational values. One way to encode the negative value (we should only allow one of the two values) could be to use the same encoding as the EBML-lacing in Matroska, see ietf-wg-cellar/matroska-specification#726 (comment)
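For reference, the signed values used by EBML lacing in Matroska are, as far as I understand them, regular VINTs whose decoded value is shifted by half of the range. The sketch below assumes that is the encoding meant by the linked comment; the function name `read_signed_vint` is hypothetical.

```python
def read_signed_vint(data: bytes, pos: int = 0) -> tuple[int, int]:
    """Read a VINT and re-center it around zero; return (value, length in bytes)."""
    first = data[pos]
    if first == 0:
        raise ValueError("VINTs longer than 8 bytes are not handled in this sketch")
    length = 9 - first.bit_length()  # leading zero bits + 1
    if pos + length > len(data):
        raise ValueError("truncated VINT")
    raw = int.from_bytes(data[pos:pos + length], "big")
    unsigned = raw & ((1 << (7 * length)) - 1)  # clear the marker bit
    # Assumption: the bias is 2^(7*length - 1) - 1, e.g. 63 for one byte.
    bias = (1 << (7 * length - 1)) - 1
    return unsigned - bias, length


print(read_signed_vint(bytes([0xBF])))  # (0, 1)
print(read_signed_vint(bytes([0x80])))  # (-63, 1)
```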

mbunkus · 2023-03-25T17:21:36Z

In my original proposal the denominator is a signed integer already. Not sure what you're missing there.

Taking @hubblec4's suggestion into account, I revise my original proposal to:

Starts with the regular EBML ID & element size fields; let's call this size field element_data_size (in bytes)
Following is a one-byte long second size field, an unsigned integer specifying the length of the following numerator field; let's call this field numerator_size (in bytes)
Following is a numerator_size bytes long unsigned integer number in big-endian byte order; this is the numerator
If 1 + numerator_size is less than element_data_size: following is an element_data_size - 1 - numerator_size bytes long signed integer number; this is the denominator. The signed integer is stored in two's complement notation with the leftmost bit being the sign bit. If 1 + numerator_size equals element_data_size, there is no denominator field and the denominator's value is 1. (The 1 accounts for the one-byte numerator_size field, which is part of the element data.)

Restrictions:

the value of the denominator MUST NOT be 0
the value of 1 + numerator_size MUST be less than or equal to element_data_size
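A possible encoder and decoder for this revised layout, as a sketch rather than a reference implementation. The function names are hypothetical. It assumes that element_data_size covers the one-byte numerator_size field as well as the numerator and denominator, so the denominator occupies whatever remains after 1 + numerator_size bytes, and it works on the element data only (EBML ID and element size already stripped).

```python
def signed_byte_length(value: int) -> int:
    """Smallest number of bytes that holds value in two's complement."""
    magnitude = value if value >= 0 else ~value
    return (magnitude.bit_length() + 1 + 7) // 8


def encode_rational(numerator: int, denominator: int = 1) -> bytes:
    if numerator < 0:
        raise ValueError("the numerator is unsigned; put the sign on the denominator")
    if denominator == 0:
        raise ValueError("the denominator must not be 0")
    numerator_size = max(1, (numerator.bit_length() + 7) // 8)
    data = bytes([numerator_size]) + numerator.to_bytes(numerator_size, "big")
    if denominator != 1:  # a denominator of 1 is simply left out
        size = signed_byte_length(denominator)
        data += denominator.to_bytes(size, "big", signed=True)
    return data


def decode_rational(payload: bytes) -> tuple[int, int]:
    if not payload:
        raise ValueError("empty element data")
    numerator_size = payload[0]
    end = 1 + numerator_size  # size byte plus numerator bytes
    if end > len(payload):
        raise ValueError("numerator_size exceeds the element data")
    numerator = int.from_bytes(payload[1:end], "big")
    if end == len(payload):
        return numerator, 1  # no denominator field present
    denominator = int.from_bytes(payload[end:], "big", signed=True)
    if denominator == 0:
        raise ValueError("the denominator must not be 0")
    return numerator, denominator


print(encode_rational(30000, 1001).hex())     # 02753003e9
print(decode_rational(bytes([0x01, 0xB4])))   # (180, 1)
```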

Rationale:

Why make the denominator signed, not the numerator? Wild guess here: the denominator will stay constant (or almost constant) for most well-known applications such as video timestamps (e.g. the denominator staying at exactly 30 * 1,000,000,000/1001 or something like that to represent 29.97 frames per second). The numerator usually grows, so let's give it that one bit more to grow into.
As a rational type might be used in each block, having a small encoding is a big plus — hence a packed structure instead of an EBML Master element with two children, one signed, one unsigned. That's quite a lot of useless overhead: two more IDs (at least two bytes), one useless size field (the one of the master element). And it introduces a number of corner cases such as what if an element is missing? How to specify default values for nested elements? Etc.
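To illustrate the "constant denominator, growing numerator" point with concrete numbers: the sketch below assumes timestamps expressed in seconds for a 30000/1001 fps stream, so every timestamp has the denominator 30000 and only the numerator changes. The time base and the function names are assumptions made for the example, not something specified in this thread.

```python
def frame_timestamp(frame_index: int) -> tuple[int, int]:
    """Timestamp of a frame in seconds at 30000/1001 fps, as (numerator, denominator)."""
    return frame_index * 1001, 30000


def max_value(n_bytes: int, signed: bool) -> int:
    """Largest value an n-byte integer can hold."""
    bits = 8 * n_bytes - (1 if signed else 0)
    return (1 << bits) - 1


print(frame_timestamp(1))      # (1001, 30000)
print(frame_timestamp(30000))  # (30030000, 30000), i.e. exactly 1001 seconds

# How many frames fit into a 4-byte numerator, unsigned versus signed.
print(max_value(4, signed=False) // 1001)
print(max_value(4, signed=True) // 1001)
```

With the numerator unsigned, roughly twice as many frames fit before it needs another byte, which is the extra bit mentioned above.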

retokromer · 2023-03-25T17:39:56Z

In my original proposal the denominator is a signed integer already.

@mbunkus Out of curiosity: Is there a reason why you chose the denominator rather than the numerator?

mbunkus · 2023-03-25T17:47:28Z

I actually explain my reasoning in the first bullet point of "Rationale".

retokromer · 2023-03-25T18:11:06Z

@mbunkus Indeed, I saw it. I was asking, because I’m not convinced. I actually did the contrary (back in the 1980s and 1990s) when I implemented a few compilers. My reasoning, back then (!), was about rounding errors and processor cycles.

mbunkus · 2023-03-25T18:17:37Z

The rationals in media containers are somewhat unusual in that for most use cases their denominator remains constant.

I don't know whether having one more bit to play with in the numerator actually makes any noticeable difference wrt. file sizes. On the other hand, I'm even less convinced that making the numerator signed instead of the denominator makes any measurable difference wrt. processing time, given that a) reading an unsigned integer requires fewer instructions & b) that this is a media format: any potential minuscule gains while parsing the container are dwarfed by the time required for decoding the content anyway.

JeromeMartinez · 2023-03-25T18:21:09Z

I don't know whether having one more bit to play with in the numerator actually makes any noticeable difference wrt.

Still good to have IMO. The parser is in charge of rounding errors and could internally move the sign to the numerator if it wishes.

retokromer · 2023-03-25T18:28:08Z

The rationals in media containers are somewhat unusual in that for most use cases their denominator remains constant.

That could be a reason, yes. Thank you!

hubblec4 · 2023-03-26T22:54:39Z

a) reading an unsigned integer requires fewer instructions

For me that is the best argument for making the numerator an unsigned integer.

And that's the reason why I want to talk about a third way to get a signed rational number.
In mathematics it is possible (and usual) to put the sign in front of the rational number, at the height of the fraction bar.
That means both numbers (numerator and denominator) are unsigned values.

Now we need only 1 bit to signal whether a rational number is negative or not.
My proposal was (in @mbunkus's words):

Following is a one-byte long second size field, an unsigned integer specifying the length of the following numerator field; let's call this field numerator_size (in bytes)

The numerator_size value has a range from 1 to 8.
That means only the lowest 4 bits are used, and we could use the first (most significant) bit for the sign option.
If the first bit is set to 1, then the rational number is negative.

For example, take the rational number value -"180/1".
The denominator has the default value "1" and can be omitted to keep this example simple.
The length of "180" is one byte. -> numerator_size in bits: 0-0-0-0-0-0-0-1
Now we set the sign bit -> 1-0-0-0-0-0-0-1
It is more or less the same as a VINT with a length of one byte.

The name numerator_size would have to change because this byte now carries more information.
Maybe sign_and_numerator_size.

Those were only some basic thoughts, and I hope I'm not wasting your time.
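The sign-bit variant could be decoded roughly like this. It is a sketch under assumptions: the function name is hypothetical, the lower seven bits of sign_and_numerator_size are treated as the size, both numerator and denominator are read as unsigned big-endian integers, and the sign is returned on the numerator purely for convenience.

```python
def decode_sign_and_size_rational(payload: bytes) -> tuple[int, int]:
    """Decode [sign_and_numerator_size][numerator][optional denominator]."""
    if not payload:
        raise ValueError("empty element data")
    sign_and_numerator_size = payload[0]
    negative = bool(sign_and_numerator_size & 0x80)  # first bit is the sign
    numerator_size = sign_and_numerator_size & 0x7F  # remaining bits are the size
    end = 1 + numerator_size
    if end > len(payload):
        raise ValueError("numerator_size exceeds the element data")
    numerator = int.from_bytes(payload[1:end], "big")
    # An omitted denominator defaults to 1.
    denominator = int.from_bytes(payload[end:], "big") if end < len(payload) else 1
    if denominator == 0:
        raise ValueError("the denominator must not be 0")
    return (-numerator if negative else numerator), denominator


# The -"180/1" example: size byte 1-0-0-0-0-0-0-1, then 180 (0xB4).
print(decode_sign_and_size_rational(bytes([0x81, 0xB4])))  # (-180, 1)
```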

robUx4 · 2023-03-27T06:34:23Z

In my original proposal the denominator is a signed integer already. Not sure what you're missing there.

I did not check. I just added a reference to an existing (efficient) EBML-like encoding for signed values, so we don't forget about it.

Looking at your proposal, it seems easier to use than this (no code needed for those not supporting EBML lacing in Matroska).

robUx4 added the format addition To consider in future EBML versions label Aug 9, 2017

robUx4 mentioned this issue Oct 11, 2020

Draft: Proposal to add some rational number in timestamps ietf-wg-cellar/matroska-specification#425

Open

hubblec4 mentioned this issue Feb 22, 2023

Support of an unique SMPTE ST 12 Timecode in track header ietf-wg-cellar/matroska-specification#684

Open

robUx4 mentioned this issue Jul 2, 2023

create a new document containing all errata and additions to RFC 8794 #412

Open

robUx4 added this to the new-rfc milestone Jul 2, 2023
