Checksumming, other constraints and asserts validation and exceptions #81

KOLANICH · 2017-01-13T23:01:46Z

http://download.intel.com/support/motherboards/desktop/sb/pnpbiosspecificationv10a.pdf

Checksum - Each Expansion Header is checksummed individually. This allows the software that wishes
to make use of an expansion header the ability to determine if the expansion header is valid. The system software can determine if the expansion header is valid by performing a Checksum operation. The method for validating the checksum is to add up Length bytes, including the Checksum field, into an 8-bit value. A resulting sum of zero indicates a valid checksum operation.

So we need a way to verify checksums.

the proposed way is if: checksum_checksum_type(....) == expected_value
where checksum_checksum_type calculates checksum, in the case from quote it sums every byte in the structure modulo 256 and we check if: checksum_sum() == 0.

The text was updated successfully, but these errors were encountered:

GreyCat · 2017-01-14T00:03:49Z

Calculating checksums is a fairly complex issue, and, to be fair, it is not needed for the primary task that KS performs, i.e. parsing. Calculating checksums, of course, would be most useful for #27.

As far as I see it, there are several subproblems with this one:

We need to be able to identify addresses / ranges for multiple-bytes operations; this should be some sort of very flexible syntax that would allow to quickly address "whole structure", "whole structure minus current (checksum) field", "structure X, bytes ranges [r1..r2] and [r3..r4]", etc.
We need to specify checksum operation itself:
- For "standard" algorithms (CRC, various hashes), it's better to just have some sort of dictionary that will result in accessing native (quick) target language methods => but we need to gather information on which standard algorithms are implemented in which languages' stdlibs and how to access them
- For non-standard algorithms (like Java's hashCode), we need to implement array processing functions (something like map, each, etc) in expression language
As for parsing support, we need to be able to trigger some sort of exception or warning when something wrong is encountered

@KOLANICH note that if is already used for conditional parsing. More likely we'd use something like check or guard or something like that.

KOLANICH · 2017-01-15T00:45:38Z

@KOLANICH note that if is already used for conditional parsing. More likely we'd use something like check or guard or something like that.

how about throw?

throw: #```throw``` block is proposed to be an ```instances``` block bound to the generated exception object but with one-time (at the time of creation) access to all the local variables.
  if: checksum_checksum_type(....) != expected_value
  result: # the member name in exception object
    type: u1 # the type of member
    enum: error # the values are from the ```error``` enum. 
    value: 'error::it_happened_again' # the value set to a member. Essentially it is the same that a ```value``` in instances, but we should be able to use visible variables here.
........
enums:
  error:
    1: it_happenned_again
    2: that_happenned_again

ghost · 2017-01-31T14:41:36Z

My issue is that I have a function for checksumming, offered by the protocol developers. It expects the pointer to the parsed structure and its size in bytes. Then it does some operations on it and returns the checksum. Something like:

uint8_t CalcCRC(uint8_t *buf, uint8_t PktSize)
{
    uint8_t bytes_amount = PktSize - 1;
    do
    {
        //here it does something with the data bytewise 
        bytes_amount--;
    } while (bytes_amount);
    return *crc;
}

All I need to do is to give the CalcCRC the pointer to the structure, but since kaitai structure contains vectors and strings, the function does not receive the initial package that way.

What I do now is I read the stream into some dummy array and give the CalcCRC pointer to it. If the sum is okay, I seek PktSize bytes back and read the data again, but into kaitai structure that time.

I thought maybe there could be a more elegant solution.

GreyCat · 2017-01-31T14:55:54Z

@MariaChakchurina I believe, you can do something like that:

my_struct_t my_struct(&ks);

my_struct._io()->seek(0);
std::string buf = my_struct._io()->read_bytes_full();
uint8_t crc = CalcCRC(&buf[0], buf.size());
if (crc != my_struct.crc_field()) {
    // throw some exception or something
}

It's not a huge difference, but at least you don't have to derive CRC field from your structure manually.

ghost · 2017-02-02T16:35:47Z

I have multiple datagrams in a stream so I can't move the pointer to its beginning, but want to return data_size bytes back, so I better do:

    d._io()->seek(d._io()->pos() - data_size);
    std::string buf = d._io()->read_bytes(data_size);
    uint8_t crc = CalcCRC((uint8_t*)&buf[0], data_size);

Just letting you know;)

GreyCat · 2017-02-02T16:37:00Z

Can't you do one substream per datagram then? It should be much more convenient.

ghost · 2017-02-03T14:27:12Z

I found #33 and #44, but still don't understand how I can do that. I really need a kaitai wiki.

GreyCat · 2017-02-04T00:02:07Z

Um, #33 seems to have nothing to do with this topic, and #44 is planned future, which will optimize things a little, but you can still use substreams right now.

Generally, when you do:

seq:
  - id: packet
    type: some_datagram_type

you don't create a substream, i.e. stream for the some_datagram_type will be reused, and it won't start reading from 0.

However, if you'll do something like that:

seq:
  - id: packet
    type: some_datagram_type
    size: 100

=> it will create a substream in some_datagram_type. It will start from 0, and is guaranteed to have exactly 100 bytes in it (or whatever you'll specify). If you'll access _io for some_datagram_type object, you'll get access only to these 100 bytes, starting from 0.

ghost · 2017-02-06T17:13:14Z

Excuse me, by 33 I meant that one https://github.com/kaitai-io/kaitai_struct_compiler/issues/33. Thank you, now I got it.

GreyCat · 2017-07-16T14:54:58Z

@KOLANICH Just for the record, I don't understand most of this proposal:

how about throw?

throw:
  if: checksum_checksum_type(....) != expected_value
  result: 
    type: u1
    enum: error
    value: it_happened_again

What is type? What is enum? That is value? If it throws an exception, then, by definition, it can't set an attribute to anything, it just stops execution there and unwinds the stack.

KOLANICH · 2017-07-17T07:37:09Z

What is `type`?

The type of the field with the id "result" in exception object.

What is `enum`?

the "result" in the example is the error code from the "error" enum, so

That is `value`?

The value of the result. There is a syntax error here: I should have used ""error::it_happenned_again""". I have fixed and amended the example. Any expression can be allowed for instances. ```throw``` block is proposed to be an ```instances``` block bound to the generated exception object but with one-time (at the time of creation) access to all the local variables.

koczkatamas · 2017-07-17T08:31:20Z

You cannot throw arbitrary objects in several supported languages, so with this approach we should create a distinct exception class (according to the result type) on-the-fly which would make the parameters obsolate (as the class would identify the exception by itself).

What came into my mind regarding exceptions:

we could store this (as reference) into the exception and/or parse every object as in debug mode: even in the case of an exception, the object would still be accessible (so postprocessing would be still possible even after an exception)
we should make exceptions distinct, so the receiver can identify exactly which (ksy / compiled code) line/field/property/etc caused the exception (if he/she wants):
- use string identifier, it's advantage is that it's easier to implement, but we have to make class specific and distinct inside the class (so if the exception name is invalid_magic in the PNG class then we have to create a string png.invalid_magic for example). Of course the "PNG" name can conflict so it would be better to use concrete namespace suggested in Ksy modularization / external ksy inclusion (discussion) #71 like io.kaitai.image.png.invalid_magic.
- use enum: in this way every distinct exception would have a distinct enum value, but if we don't want put these enums into the exception object as a generic type (eg. object), then we should create an exception class per normal class. This has the advantage that these exceptions can be caught class-specifically at a higher level and the enum value can be checked without casting. This way the this reference could be specific (and not only object).
- use parametric types suggested by Parametric types #192.

Of course even if we create specific exceptions (option 2 & 3) those should be inherited from a common exception class which is defined in the runtime. So every Kaitai exception could be caught at once if needed.

KOLANICH · 2017-07-17T10:10:25Z

which would make the parameters obsolate (as the class would identify the exception by itself).

Are you going to have a separate exception class for every u8 number and a separate exception class for every string possible and how are you going to achieve this? We would like to provide some context without ruining exception safety (on exception everything over the catch must be destroyed, the resources freed, not leaked). So we would need some context to be able to give the error messages like "invalid opcode: foo", "100 (offset) > 2 (length)". Do do distinguishing between exceptions right we need runtime-parameterized types with instantiation syntax first. Another approach is to have only one exception type per type and throw only it from the type, but this approach is crippled.

KOLANICH · 2023-07-14T13:56:54Z

Forgot to add xrefs here: kaitai-io/kaitai_struct_formats#625 kaitai-io/kaitai_struct_formats#624

GreyCat added the enhancement label Feb 21, 2017

GreyCat mentioned this issue Apr 24, 2017

Allow repeat: until to optionally exclude the last non-matching element (look-ahead feature) #156

Open

This was referenced Jul 5, 2017

how to validate filed with custom function? #199

Closed

Range Validation #201

Closed

KOLANICH mentioned this issue Jul 15, 2017

Signaling errors in switch constructs #208

Open

GreyCat mentioned this issue Aug 24, 2017

Decoupling visualizer vs parser-running engine (visualizer protocol) #143

Open

KOLANICH mentioned this issue Feb 2, 2018

Container special type #340

Open

KOLANICH changed the title ~~Checksumming~~ Checksumming, other constraints and asserts validation and exceptions Feb 8, 2018

GreyCat mentioned this issue Mar 13, 2018

Is it possible to store the current parse index only? #378

Open

GreyCat mentioned this issue Apr 10, 2018

Make "contents" check available for individual bits? #380

Open

KOLANICH mentioned this issue May 13, 2018

Signatures as a special case of enums #434

Open

GreyCat mentioned this issue May 15, 2018

Validation of parsed data with assertions #435

Open

nigelht mentioned this issue Sep 5, 2018

Target Format Transpiler #471

Closed

KOLANICH mentioned this issue Mar 10, 2019

Add the different variants of the Unix ar format kaitai-io/kaitai_struct_formats#126

Open

KOLANICH mentioned this issue May 9, 2019

Support custom size bits or just bits in magic signatures #578

Open

dgelessus mentioned this issue May 20, 2019

Add Python Pickle format kaitai-io/kaitai_struct_formats#150

Merged

1 task

KOLANICH mentioned this issue Sep 22, 2019

Fix WAV and add RIFF kaitai-io/kaitai_struct_formats#229

Merged

KOLANICH mentioned this issue Sep 22, 2019

Resource Interchange File Format kaitai-io/kaitai_struct_formats#230

Closed

KOLANICH mentioned this issue Oct 18, 2019

Switch fall-through #638

Closed

GreyCat mentioned this issue May 15, 2020

Add btrfs-stream format kaitai-io/kaitai_struct_formats#286

Merged

GreyCat mentioned this issue Jul 14, 2023

How to get the size of the read record and its source buffer for crc validation? #1046

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Checksumming, other constraints and asserts validation and exceptions #81

Checksumming, other constraints and asserts validation and exceptions #81

KOLANICH commented Jan 13, 2017 •

edited

Loading

GreyCat commented Jan 14, 2017

KOLANICH commented Jan 15, 2017 •

edited

Loading

ghost commented Jan 31, 2017 •

edited by ghost

Loading

GreyCat commented Jan 31, 2017

ghost commented Feb 2, 2017

GreyCat commented Feb 2, 2017

ghost commented Feb 3, 2017 •

edited by ghost

Loading

GreyCat commented Feb 4, 2017

ghost commented Feb 6, 2017 •

edited by ghost

Loading

GreyCat commented Jul 16, 2017 •

edited

Loading

KOLANICH commented Jul 17, 2017 via email •

edited

Loading

koczkatamas commented Jul 17, 2017 •

edited

Loading

KOLANICH commented Jul 17, 2017 via email •

edited

Loading

KOLANICH commented Jul 14, 2023

Checksumming, other constraints and asserts validation and exceptions #81

Checksumming, other constraints and asserts validation and exceptions #81

Comments

KOLANICH commented Jan 13, 2017 • edited Loading

GreyCat commented Jan 14, 2017

KOLANICH commented Jan 15, 2017 • edited Loading

ghost commented Jan 31, 2017 • edited by ghost Loading

GreyCat commented Jan 31, 2017

ghost commented Feb 2, 2017

GreyCat commented Feb 2, 2017

ghost commented Feb 3, 2017 • edited by ghost Loading

GreyCat commented Feb 4, 2017

ghost commented Feb 6, 2017 • edited by ghost Loading

GreyCat commented Jul 16, 2017 • edited Loading

KOLANICH commented Jul 17, 2017 via email • edited Loading

koczkatamas commented Jul 17, 2017 • edited Loading

KOLANICH commented Jul 17, 2017 via email • edited Loading

KOLANICH commented Jul 14, 2023

KOLANICH commented Jan 13, 2017 •

edited

Loading

KOLANICH commented Jan 15, 2017 •

edited

Loading

ghost commented Jan 31, 2017 •

edited by ghost

Loading

ghost commented Feb 3, 2017 •

edited by ghost

Loading

ghost commented Feb 6, 2017 •

edited by ghost

Loading

GreyCat commented Jul 16, 2017 •

edited

Loading

KOLANICH commented Jul 17, 2017 via email •

edited

Loading

koczkatamas commented Jul 17, 2017 •

edited

Loading

KOLANICH commented Jul 17, 2017 via email •

edited

Loading