Skip to content
This repository was archived by the owner on May 4, 2024. It is now read-only.

Conversation

@davidiw
Copy link
Collaborator

@davidiw davidiw commented Oct 2, 2022

Limit parsing of nested type tags in the type parser

Copy link
Collaborator Author

@davidiw davidiw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

addressing other comments...

Copy link

@igor-aptos igor-aptos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good to me, don't have approval rights for this repo :)

Comment on lines 23 to 24
#[cfg(test)]
const MAX_DE_TYPE_TAG_NESTING: u8 = MAX_TYPE_TAG_NESTING - 1;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to confirm, this is there just so we can check that both serialization and deserialization fails, correct?

wanna add a comment for that?

@dariorussi
Copy link
Member

quick question: when do you plan/want this to get in?
this is really something I would love @vgao1996 to review.
That can happen after too but I really think he has the best context to evaluate this.
In any case review coming soon....

Copy link
Member

@dariorussi dariorussi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it looks good to me but @vgao1996 please review as you have a chance tomorrow

Copy link
Member

@wrwg wrwg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's an awesome hack and re-engineering of parts of serde! However, I'm bit concerned about the complexity here (and the possibility of new bugs). Lets discuss offline.

}

#[derive(Serialize)]
enum SafeTypeTag<'a> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggest to call this TypeTagForSerialize. There is nothing safe about it besides how you use it.

}
}

// u8 -> depth or nesting level
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you trying to say that the u8 is the depth? Then please do so, the comment is cryptic otherwise

// u8 -> depth or nesting level
struct TypeTagVisitor(u8);

use serde::de::{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please ensure to format before sending? I was spending a while wondering where your Visitor and other types come from, as I did not see them imported on the top of the file (where expected).

self, DeserializeSeed, Deserializer, EnumAccess, MapAccess, SeqAccess, VariantAccess, Visitor,
};

impl<'de> Visitor<'de> for TypeTagVisitor {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would be nice to see some commenting here, like

// Manual implementation of what serde would derive automatically, but with an injected bound. We are using this 
// here as currently no configurable type depth is supported by serde. See [link if there is something to see].

Or whatever makes sense. Otherwise, someone seeing this code here would just declare us as insane.

Also as you seem to have reverse engineered Serde, it would be nice you give some hints so others don't need to do the same as far as this is possible.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is how I keep my job...

where
A: EnumAccess<'de>,
{
let (variant_type, variant) = data.variant::<DeTypeTag>()?;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do we get a match here of DeTypeTag vs the actual type? By position in the variant list of enum? If so it would be helpful if that has been documented with a short sentence at DeTypeTag.

Ok(TypeTag::Signer)
}
DeTypeTag::Vector => Ok(TypeTag::Vector(Box::new(
variant.newtype_variant_seed(NestedTypeTag(self.0))?,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not just make self.0 be self.nest or something.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hate extra code :)

where
D: Deserializer<'de>,
{
if self.0 >= MAX_DE_TYPE_TAG_NESTING {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps move the condition and the error generation into some check_nest(self.0)? as it is used more than once.

Copy link
Collaborator Author

@davidiw davidiw Oct 3, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is used by two different objects, but I can see the point

)));
}

deserializer.deserialize_seq(VecTypeTagVisitor(self.0))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

self.0 + 1?

If this was actually omitted (and after you already had a review), consider what this means for in-depth defense. Those also have bugs.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

at this point, we haven't really picked up a new type tag. the type tag will be incremented for each actual TypeTag we visit.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in other words... comment please

}
}

const TYPE_TAG_VARIANTS: &[&str] = &["bool", "u8", "u64", "u128", "signer", "vector", "struct"];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move the const decls to the front? This was used up somewhere already IIRC.

// u8 -> depth or nesting level
struct NestedStructTag(u8);

impl<'de> DeserializeSeed<'de> for NestedStructTag {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move this closer to the struct stuff?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect one could bikeshed a lot about the organization of this file. I might argue putting the public types at the top is more readable. I could also see an argument about relational placement.

@wrwg
Copy link
Member

wrwg commented Oct 3, 2022

I've a potential alternative (simpler) solution. It involves thread locals. I hope I can send for comparison before EOD.

@davidiw
Copy link
Collaborator Author

davidiw commented Oct 3, 2022

I'd like to move fast here. If we can come up with a generalizable scheme, I'm all game. I do think that this is a rather safe PR and something that would definitely be beneficial upstream. It is a mess that serde doesn't support limited nesting at a struct level.

wrwg added a commit to wrwg/move-lang that referenced this pull request Oct 3, 2022
This is an alternative version of what move-language#528 and the previous for limitation of nesting provides.

This approach installs custom serializers at the recursion points of the `TypeTag` enum. Those serializer use a thread local to track the nesting of the type tag, both for serialization and deserialization.

The usage of a thread local is safe and a common technique to provide extra state for implementations which miss it (e.g. serde). A given thread has a single execution stack and this will always count the depth counter consistently up and down. Because Rust has no exceptions are other exit points, this cannot be corrupted. Possibly, thread locals may cause inefficiency. However, we expect TypeTags to be small (a few constructors, limited nestings).

The technique used for type tags can be generalized for any other recursive type if needed. Eventually, we hope serde will support per-type nesting, and possibly we should talk with the author if we can add it ourselves.

This uses and passes the tests from move-language#528.
wrwg added a commit to wrwg/move-lang that referenced this pull request Oct 3, 2022
Cherry picked from move-language#529.

This is an alternative version of what move-language#528 and the previous for limitation of nesting provides, having a significant lower footprint.

This approach installs custom serializers at the recursion points of the `TypeTag` enum. Those serializer use a thread local to track the nesting of the type tag, both for serialization and deserialization.

The usage of a thread local is safe and a common technique to provide extra state for implementations which miss it (e.g. serde). A given thread has a single execution stack and this will always count the depth counter consistently up and down. Because Rust has no exceptions are other exit points, this cannot be corrupted. Possibly, thread locals may cause inefficiency. However, we expect TypeTags to be small (a few constructors, limited nestings).

The technique used for type tags can be generalized for any other recursive type if needed. Eventually, we hope serde will support per-type nesting, and possibly we should talk with the author if we can add it ourselves.

This uses and passes the tests from move-language#528.
wrwg added a commit to wrwg/move-lang that referenced this pull request Oct 3, 2022
Cherry picked from move-language#529.

This is an alternative version of what move-language#528 and the previous for limitation of nesting provides, having a significant lower footprint.

This approach installs custom serializers at the recursion points of the `TypeTag` enum. Those serializer use a thread local to track the nesting of the type tag, both for serialization and deserialization.

The usage of a thread local is safe and a common technique to provide extra state for implementations which miss it (e.g. serde). A given thread has a single execution stack and this will always count the depth counter consistently up and down. Because Rust has no exceptions are other exit points, this cannot be corrupted. Possibly, thread locals may cause inefficiency. However, we expect TypeTags to be small (a few constructors, limited nestings).

The technique used for type tags can be generalized for any other recursive type if needed. Eventually, we hope serde will support per-type nesting, and possibly we should talk with the author if we can add it ourselves.

This uses and passes the tests from move-language#528.
wrwg added a commit to wrwg/move-lang that referenced this pull request Oct 3, 2022
Cherry picked from move-language#529.

This is an alternative version of what move-language#528 and the previous for limitation of nesting provides, having a significant lower footprint.

This approach installs custom serializers at the recursion points of the `TypeTag` enum. Those serializer use a thread local to track the nesting of the type tag, both for serialization and deserialization.

The usage of a thread local is safe and a common technique to provide extra state for implementations which miss it (e.g. serde). A given thread has a single execution stack and this will always count the depth counter consistently up and down. Because Rust has no exceptions are other exit points, this cannot be corrupted. Possibly, thread locals may cause inefficiency. However, we expect TypeTags to be small (a few constructors, limited nestings).

The technique used for type tags can be generalized for any other recursive type if needed. Eventually, we hope serde will support per-type nesting, and possibly we should talk with the author if we can add it ourselves.

This uses and passes the tests from move-language#528.
wrwg added a commit to wrwg/move-lang that referenced this pull request Oct 3, 2022
Cherry picked from move-language#529.

This is an alternative version of what move-language#528 and the previous for limitation of nesting provides, having a significant lower footprint.

This approach installs custom serializers at the recursion points of the `TypeTag` enum. Those serializer use a thread local to track the nesting of the type tag, both for serialization and deserialization.

The usage of a thread local is safe and a common technique to provide extra state for implementations which miss it (e.g. serde). A given thread has a single execution stack and this will always count the depth counter consistently up and down. Because Rust has no exceptions are other exit points, this cannot be corrupted. Possibly, thread locals may cause inefficiency. However, we expect TypeTags to be small (a few constructors, limited nestings).

The technique used for type tags can be generalized for any other recursive type if needed. Eventually, we hope serde will support per-type nesting, and possibly we should talk with the author if we can add it ourselves.

This uses and passes the tests from move-language#528.
wrwg added a commit that referenced this pull request Oct 3, 2022
Cherry picked from #529.

This is an alternative version of what #528 and the previous for limitation of nesting provides, having a significant lower footprint.

This approach installs custom serializers at the recursion points of the `TypeTag` enum. Those serializer use a thread local to track the nesting of the type tag, both for serialization and deserialization.

The usage of a thread local is safe and a common technique to provide extra state for implementations which miss it (e.g. serde). A given thread has a single execution stack and this will always count the depth counter consistently up and down. Because Rust has no exceptions are other exit points, this cannot be corrupted. Possibly, thread locals may cause inefficiency. However, we expect TypeTags to be small (a few constructors, limited nestings).

The technique used for type tags can be generalized for any other recursive type if needed. Eventually, we hope serde will support per-type nesting, and possibly we should talk with the author if we can add it ourselves.

This uses and passes the tests from #528.
@davidiw davidiw changed the title [types] make type tag nesting limited in deserialize [types] limit parsing of nested TypeTag Oct 3, 2022
@davidiw davidiw enabled auto-merge (rebase) October 3, 2022 22:40
There's limited value and a lot of wasted resources by leaving this
limited to serde's limits. Recursion is dangerous.
@davidiw davidiw merged commit d873870 into move-language:aptos Oct 3, 2022
wrwg added a commit to wrwg/move-lang that referenced this pull request Oct 6, 2022
Cherry picked from move-language#529.

This is an alternative version of what move-language#528 and the previous for limitation of nesting provides, having a significant lower footprint.

This approach installs custom serializers at the recursion points of the `TypeTag` enum. Those serializer use a thread local to track the nesting of the type tag, both for serialization and deserialization.

The usage of a thread local is safe and a common technique to provide extra state for implementations which miss it (e.g. serde). A given thread has a single execution stack and this will always count the depth counter consistently up and down. Because Rust has no exceptions are other exit points, this cannot be corrupted. Possibly, thread locals may cause inefficiency. However, we expect TypeTags to be small (a few constructors, limited nestings).

The technique used for type tags can be generalized for any other recursive type if needed. Eventually, we hope serde will support per-type nesting, and possibly we should talk with the author if we can add it ourselves.

This uses and passes the tests from move-language#528.
wrwg added a commit to wrwg/move-lang that referenced this pull request Jan 3, 2023
Cherry picked from move-language#529.

This is an alternative version of what move-language#528 and the previous for limitation of nesting provides, having a significant lower footprint.

This approach installs custom serializers at the recursion points of the `TypeTag` enum. Those serializer use a thread local to track the nesting of the type tag, both for serialization and deserialization.

The usage of a thread local is safe and a common technique to provide extra state for implementations which miss it (e.g. serde). A given thread has a single execution stack and this will always count the depth counter consistently up and down. Because Rust has no exceptions are other exit points, this cannot be corrupted. Possibly, thread locals may cause inefficiency. However, we expect TypeTags to be small (a few constructors, limited nestings).

The technique used for type tags can be generalized for any other recursive type if needed. Eventually, we hope serde will support per-type nesting, and possibly we should talk with the author if we can add it ourselves.

This uses and passes the tests from move-language#528.
wrwg added a commit that referenced this pull request Jan 3, 2023
* [types] make type tag nesting limited

There's limited value and a lot of wasted resources by leaving this
limited to serde's limits. Recursion is dangerous.

* [types] limit nesting of types via state (#532)

Cherry picked from #529.

This is an alternative version of what #528 and the previous for limitation of nesting provides, having a significant lower footprint.

This approach installs custom serializers at the recursion points of the `TypeTag` enum. Those serializer use a thread local to track the nesting of the type tag, both for serialization and deserialization.

The usage of a thread local is safe and a common technique to provide extra state for implementations which miss it (e.g. serde). A given thread has a single execution stack and this will always count the depth counter consistently up and down. Because Rust has no exceptions are other exit points, this cannot be corrupted. Possibly, thread locals may cause inefficiency. However, we expect TypeTags to be small (a few constructors, limited nestings).

The technique used for type tags can be generalized for any other recursive type if needed. Eventually, we hope serde will support per-type nesting, and possibly we should talk with the author if we can add it ourselves.

This uses and passes the tests from #528.

* [types] limit parsing of nested TypeTag

There's limited value and a lot of wasted resources by leaving this
limited to serde's limits. Recursion is dangerous.

Co-authored-by: David Wolinsky <isaac.wolinsky@gmail.com>
wrwg added a commit to wrwg/move-lang that referenced this pull request Jan 3, 2023
* [types] make type tag nesting limited

There's limited value and a lot of wasted resources by leaving this
limited to serde's limits. Recursion is dangerous.

* [types] limit nesting of types via state (move-language#532)

Cherry picked from move-language#529.

This is an alternative version of what move-language#528 and the previous for limitation of nesting provides, having a significant lower footprint.

This approach installs custom serializers at the recursion points of the `TypeTag` enum. Those serializer use a thread local to track the nesting of the type tag, both for serialization and deserialization.

The usage of a thread local is safe and a common technique to provide extra state for implementations which miss it (e.g. serde). A given thread has a single execution stack and this will always count the depth counter consistently up and down. Because Rust has no exceptions are other exit points, this cannot be corrupted. Possibly, thread locals may cause inefficiency. However, we expect TypeTags to be small (a few constructors, limited nestings).

The technique used for type tags can be generalized for any other recursive type if needed. Eventually, we hope serde will support per-type nesting, and possibly we should talk with the author if we can add it ourselves.

This uses and passes the tests from move-language#528.

* [types] limit parsing of nested TypeTag

There's limited value and a lot of wasted resources by leaving this
limited to serde's limits. Recursion is dangerous.

Co-authored-by: David Wolinsky <isaac.wolinsky@gmail.com>
wrwg added a commit to wrwg/move-lang that referenced this pull request Jan 3, 2023
* [types] make type tag nesting limited

There's limited value and a lot of wasted resources by leaving this
limited to serde's limits. Recursion is dangerous.

* [types] limit nesting of types via state (move-language#532)

Cherry picked from move-language#529.

This is an alternative version of what move-language#528 and the previous for limitation of nesting provides, having a significant lower footprint.

This approach installs custom serializers at the recursion points of the `TypeTag` enum. Those serializer use a thread local to track the nesting of the type tag, both for serialization and deserialization.

The usage of a thread local is safe and a common technique to provide extra state for implementations which miss it (e.g. serde). A given thread has a single execution stack and this will always count the depth counter consistently up and down. Because Rust has no exceptions are other exit points, this cannot be corrupted. Possibly, thread locals may cause inefficiency. However, we expect TypeTags to be small (a few constructors, limited nestings).

The technique used for type tags can be generalized for any other recursive type if needed. Eventually, we hope serde will support per-type nesting, and possibly we should talk with the author if we can add it ourselves.

This uses and passes the tests from move-language#528.

* [types] limit parsing of nested TypeTag

There's limited value and a lot of wasted resources by leaving this
limited to serde's limits. Recursion is dangerous.

Co-authored-by: David Wolinsky <isaac.wolinsky@gmail.com>
wrwg added a commit that referenced this pull request Jan 3, 2023
* [types] make type tag nesting limited

There's limited value and a lot of wasted resources by leaving this
limited to serde's limits. Recursion is dangerous.

* [types] limit nesting of types via state (#532)

Cherry picked from #529.

This is an alternative version of what #528 and the previous for limitation of nesting provides, having a significant lower footprint.

This approach installs custom serializers at the recursion points of the `TypeTag` enum. Those serializer use a thread local to track the nesting of the type tag, both for serialization and deserialization.

The usage of a thread local is safe and a common technique to provide extra state for implementations which miss it (e.g. serde). A given thread has a single execution stack and this will always count the depth counter consistently up and down. Because Rust has no exceptions are other exit points, this cannot be corrupted. Possibly, thread locals may cause inefficiency. However, we expect TypeTags to be small (a few constructors, limited nestings).

The technique used for type tags can be generalized for any other recursive type if needed. Eventually, we hope serde will support per-type nesting, and possibly we should talk with the author if we can add it ourselves.

This uses and passes the tests from #528.

* [types] limit parsing of nested TypeTag

There's limited value and a lot of wasted resources by leaving this
limited to serde's limits. Recursion is dangerous.

Co-authored-by: David Wolinsky <isaac.wolinsky@gmail.com>

Co-authored-by: David Wolinsky <isaac.wolinsky@gmail.com>
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants