You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Currently, rbx-dom throws a pretty nondescript error when an instance has a name containing invalid utf-8:
[TRACE rbx_binary::deserializer::state] Known prop, canonical name C0 and type CFrame
[TRACE rbx_binary::chunk] Chunk "PROP" (compressed: 24, len: 37, reserved: 0)
[TRACE rbx_binary::deserializer::state] PROP chunk (RotateP.C1, instance type 59, prop type 59
[TRACE rbx_binary::deserializer::state] Known prop, canonical name C1 and type CFrame
[TRACE rbx_binary::chunk] Chunk "PROP" (compressed: 20, len: 18, reserved: 0)
[TRACE rbx_binary::deserializer::state] PROP chunk (RotateP.Enabled, instance type 59, prop type 59
[TRACE rbx_binary::deserializer::state] Known prop, canonical name Enabled and type Bool
[TRACE rbx_binary::chunk] Chunk "PROP" (compressed: 30, len: 50, reserved: 0)
[TRACE rbx_binary::chunk] Chunk "PROP" (compressed: 69, len: 103, reserved: 0)
[TRACE rbx_binary::deserializer::state] PROP chunk (RotateP.Name, instance type 59, prop type 59
stream did not contain valid UTF-8
This particular instance is more common in Roblox than one might think and is known as a "virus" instance, one created by a malicious script or plugin to be intentionally annoying or disruptive. Developers that get this error generally don't know that they have these, and the instances are also generally harmless, but make deserialization in rbx-dom fail with the above error message.
It would be nice to throw a better error here (and maybe in other cases for invalid utf-8), maybe even mentioning the exact instance and where it is so developers can deal with it accordingly.
The text was updated successfully, but these errors were encountered:
Looked into this more, and it's going to be difficult (or impossible) to give much useful info (e.g. the instance's full name) in these errors because the DOM is necessarily incomplete in the middle of a deserialization operation.
Maybe instead of more specific errors, we could just do a lossy conversion for non UTF-8 Variant::String properties? The main drawback is just the fact the original data would be altered, and there may be legitimate uses for non UTF-8 string properties!
Alternatively, we could start reading Variant::String props into a Vec<u8> instead of a String. This would be more correct, but it would be a breaking change and makes string properties more complicated to work with for consumers.
Alternatively, we could start reading Variant::String props into a Vec instead of a String. This would be more correct, but it would be a breaking change and makes string properties more complicated to work with for consumers.
This actually sounds great for Lune as a consumer since we could, in many cases, skip the entire conversion step between dom string -> rust string -> lua (c) string. We could also unify some code we have for differences in handling with normal strings / binary strings. I can't really speak for Rojo or any other tools (are there any other than Remodel/Lune?) though.
Currently,
rbx-dom
throws a pretty nondescript error when an instance has a name containing invalid utf-8:This particular instance is more common in Roblox than one might think and is known as a "virus" instance, one created by a malicious script or plugin to be intentionally annoying or disruptive. Developers that get this error generally don't know that they have these, and the instances are also generally harmless, but make deserialization in
rbx-dom
fail with the above error message.It would be nice to throw a better error here (and maybe in other cases for invalid utf-8), maybe even mentioning the exact instance and where it is so developers can deal with it accordingly.
The text was updated successfully, but these errors were encountered: