Method chain as item #3977
| .directed_by("Yorgos Lanthimos"); | ||
| ``` | ||
|
|
||
| A method chain call is a single, atomic expression: only the complete chain, written out in |
I don't understand how this is beneficial if the entire chain is required. Optionality complements named parameters, and this proposal lacks it entirely.
This is also quite similar to method chains in Dart, e.g. var foo = new Foo()..a()..b()..c(); which calls a, b, and c on the new instance, discards their return values, and then initializes the variable with the new instance. However, Dart's version is much more versatile. A Rust equivalent would work for methods that take self by shared or mutable reference, which allows for optionality.
But Dart-like method chaining is still missing required named parameters. While the proposed feature can emulate those, it adds arbitrary restrictions on the ordering of the "method" invocations.
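To make the comparison concrete, here is a minimal sketch of how the Dart-style pattern is usually approximated in Rust today. The type `Foo`, its fields, and the helpers `build_all` and `build_some` are hypothetical and exist only for illustration. Unlike Dart's `..`, each method here has to return `&mut Self` explicitly for the chain to continue.

```rust
// Hypothetical type, used only to illustrate the comparison with Dart.
#[derive(Debug, Default, PartialEq)]
struct Foo {
    a: bool,
    b: u32,
    c: Option<String>,
}

impl Foo {
    // Each setter takes `&mut self` and hands the same reference back,
    // so calls can be chained and any of them can be left out.
    fn a(&mut self) -> &mut Self {
        self.a = true;
        self
    }

    fn b(&mut self, n: u32) -> &mut Self {
        self.b = n;
        self
    }

    fn c(&mut self, s: &str) -> &mut Self {
        self.c = Some(s.to_owned());
        self
    }
}

// Calls every setter, roughly like `new Foo()..a()..b(2)..c("x")` in Dart.
fn build_all() -> Foo {
    let mut foo = Foo::default();
    foo.a().b(2).c("x");
    foo
}

// Calls only one setter: optionality comes for free, but nothing is required.
fn build_some() -> Foo {
    let mut foo = Foo::default();
    foo.b(7);
    foo
}
```

This shows both halves of the point above: the steps are optional and can be reordered, but the compiler cannot force any of them to be present.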
This proposal is similar to Smalltalk/Objective-C style method calls (e.g. receiver moveWidgetFrom: arg1 to: arg2) in that it defines a fixed sequence of identifiers ("moveWidgetFrom:to:") which constitute mandatory naming for a fixed parameter list (arg1, arg2). However:
Declaring two method chains … with initial sections under the same name in the same scope is exactly as much a duplicate-definition error as declaring two ordinary functions under that name is today.
That is, the declaration of the chain .foo().bar() requires that the name foo does not collide with other chains or methods. That makes this proposal far more limited than Smalltalk/Objective-C, where foo:bar: and foo:baz: are two distinct methods that can coexist.
Per #future-possibilities, @contactomorph sees this as something that can be added later. Thus, this RFC is serving two purposes, as I see it:
1. Adding a mechanism to rigorously name arguments, without the need to define a builder type and spread the documentation of the operation around that builder's items.
2. Being the foundation to later add branching/optional/repeated chain elements, thus becoming a means of overloading and a replacement for some but not all builders.
Given that, the questions that need answering are:
- Is (1) worthwhile on its own?
- If so, is this design a good way to achieve (1)?
- Should Rust accept (1) now for the sake of getting (2) later?
| - **You cannot skip a section, call sections out of order, call a section twice, or stop | ||
| partway.** Only `define_movie(…).released_in(…).directed_by(…)`, written out in full, in | ||
| that order, in one expression, is something the compiler recognizes as a call at all. | ||
| - **An incomplete chain is a compile error, not a runtime concern.** There is no "half-built |
I don't think forcing a compilation error is going to work cleanly. Since x.y().z() is itself a valid expression, the error can't happen at the parser level, so it must be enforced by typeck, which means the compiler needs to invent some type that can only be used if directly followed by the rest of the chain. This can be partially achieved by making the intermediate type unsized. With the current situation of having unsized_fn_params but not unsized_locals, you'll get this behavior:
// allowed
let movie_0 = define_movie(t0).released_in(y0).directed_by(n0);
// also allowed (!)
let movie_1 = ( define_movie(t1).released_in(y1) ).directed_by(n1);
// compile error (currently an ICE)
let movie_2 = { define_movie(t2).released_in(y2) }.directed_by(n2);
// allowed, but you can't use `partial_movie_3` to do anything safe.
let partial_movie_3 = &(define_movie(t3).released_in(y3));
| fn define_movie<'a>(name: &'a str) | ||
| .released_in(release_year: usize) | ||
| .directed_by(director_name: &'a str) -> Movie |
This example already strikes me as contentious due to the fact that the two strings share the same lifetime. How do you expect this to work? Should the compiler automatically shorten the lifetimes to be the intersection of the two (failing if this cannot occur), or should it refuse to accept a second string if its lifetime is shorter than the first?
To be clear, I think the former is the right response here, but I would say this example is not nearly as simple as it implies it should be, and already asks a lot of questions that would need to be answered.
Importantly, these are questions that authors of builder-pattern structs already need to resolve today, which also makes it clear that this way of doing things doesn't necessarily get rid of those problems.
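For reference, this is how an ordinary function with the same signature shape behaves today. It is only a sketch: the lifetime parameter on `Movie<'a>`, the helper `director_len`, and the placeholder title and year are assumptions, since the RFC excerpt returns a plain `Movie`. With a shared `'a`, the compiler picks a lifetime that both arguments can satisfy, shortening the longer borrow.

```rust
// Assumption: `Movie` borrows its strings, so it carries the lifetime `'a`.
#[derive(Debug, PartialEq)]
struct Movie<'a> {
    name: &'a str,
    release_year: usize,
    director_name: &'a str,
}

// Ordinary-function counterpart of the chain in the RFC excerpt.
fn define_movie<'a>(name: &'a str, release_year: usize, director_name: &'a str) -> Movie<'a> {
    Movie { name, release_year, director_name }
}

// Hypothetical helper showing two arguments with different lifetimes.
fn director_len() -> usize {
    // Placeholder title; lives for the whole program.
    let name: &'static str = "A Title";
    // Owned string; borrows of it are only valid inside this function.
    let director = String::from("Yorgos Lanthimos");
    // `'a` is inferred as a lifetime no longer than the borrow of `director`;
    // the `&'static str` is accepted because it can be shortened to match.
    let movie = define_movie(name, 2001, &director);
    movie.director_name.len()
}
```

This corresponds to the first option in the comment (shorten to a common lifetime). Whether a method chain would infer `'a` across sections the same way is exactly the open question.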
| Method chains avoid this trade-off entirely, while staying purely additive. They introduce a new | ||
| kind of item, they do not change the meaning of any existing function or call expression: | ||
|
|
||
| - They generate no boilerplate beyond the chain's own declaration: there is nothing to design or | ||
| hand-write beyond the chain itself (no separate builder type, no type-level tracking of what's | ||
| been supplied so far). | ||
| - It is checked for completeness entirely at compile time, with no runtime check and no extra | ||
| type-level machinery needed to get there (see Guide-level explanation). | ||
| - The whole chain is documented as a single item, not fragmented across a builder type's page and | ||
| its methods' pages (see Rationale and alternatives for why a macro-based implementation cannot | ||
| achieve this on its own). | ||
|
|
||
| Because this RFC proposes a new kind of item rather than new call syntax for existing functions, | ||
| it sidesteps the two constraints described above that proposals about named/optional parameters | ||
| must consider. Nothing about how `f(a, b, c)` resolves today changes, because the syntax for method | ||
| chains just mimics an ordinary chain of method calls and does not require altering the syntax for | ||
| individual functions/methods. A method chain's own name still resolves to exactly one item, the same way an | ||
| ordinary function's does (see Reference-level explanation): nothing about it depends on which | ||
| arguments are passed, so no question of arity-based overloading ever comes up. This still addresses | ||
| the same underlying need: letting callers supply grouped, named, and conditionally-shaped sets of | ||
| parameters without writing or maintaining a hand-rolled builder. |
So this is actually something I recently have thought a bit about, and I think that information like this belongs in the rationale of the RFC, not the motivation. Effectively, the motivation is where you pitch the problem, the explanations are the solution, and the rationale is where you justify the solution. This feels like a justification of method chains specifically rather than describing any particular problem.
| Method chains avoid this trade-off entirely, while staying purely additive. They introduce a new | ||
| kind of item, they do not change the meaning of any existing function or call expression: |
I would disagree mostly for reasons that feel a bit pedantic, but are important. Particularly, while it does not strictly affect any existing code which does not use this feature, it would affect the meaning of future code, because now I need to ask myself whenever I see a method chain, is this a Method Chain, or just a chain of method calls? And this invokes a nontrivial amount of cognitive overhead that should be considered.
For example, let's imagine the method call a().b().c().d().e(). You can add parameters if you think it matters. Currently, there is one canonical way to read this: a() is a function which returns some value (call it A), and then you call A::b, then B::c, etc.
With this new framework, I need to verify that first, there does not exist a method chain called a(), and then depending on the method chain definition, I interpret this differently. Using parentheses to disambiguate method chains, this could be either (a().b()).c().d().e(), (a().b().c()).d().e(), (a().b().c().d()).e(), or (a().b().c().d().e()). And, of course, the original, which could be thought of as a "trivial method chain," but I'm not going to do that here.
And then things get even more complicated if you have optional methods.
Sure, this merely adds something to the language from the perspective of it not breaking past code, but that is a requirement for any new features, not a benefit of any particular feature. It changes how people think about the code, which is not merely additive.
| - Writing only a prefix of a chain's sections (`ident(args0)` alone, `ident(args0).section1(args1)` | ||
| alone, and so on up to, but not including, the full chain) is not a valid expression in any | ||
| position: not as a statement, not as the right-hand side of a `let`, not as a function | ||
| argument, not as a return value. There is no point during resolution or evaluation at which | ||
| "the result so far" exists as a value of any type, nameable or not, that a program could refer | ||
| to, store, or pass around. A chain that is not completed in one expression is a compile-time | ||
| error naming the section the compiler expected to come next. |
This… is not the kind of semantics I would expect for a feature like this. I would expect that the exact semantics are, an incomplete prefix is an incomplete expression, and will fail to compile.
Specifically, I would expect the following to still work:
macro_rules! wrap {
($x:expr) => { $x }
}
let _ = wrap!(wrap!(a()).b()).c();
if a().b().c() is a valid method chain. It's very natural for macros to build up normal method chains like this, and I would expect Method Chains proper to only fail in type-checking.
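A self-contained version of the macro example, showing that it compiles today for an ordinary chain of method calls. The types `A` and `B`, the functions `a`, `b`, `c`, and the helper `chained_through_macros` are hypothetical stand-ins; nothing here uses the proposed feature.

```rust
// Identity macro: expands to the expression it was given.
macro_rules! wrap {
    ($x:expr) => { $x };
}

// Hypothetical types forming an ordinary three-step chain.
struct A;
struct B;

fn a() -> A {
    A
}

impl A {
    fn b(self) -> B {
        B
    }
}

impl B {
    fn c(self) -> u32 {
        42
    }
}

// Each prefix of the chain is produced by a separate macro expansion,
// yet the whole thing is an ordinary expression after expansion.
fn chained_through_macros() -> u32 {
    let value = wrap!(wrap!(a()).b()).c();
    value
}
```

Under the RFC text quoted above, the inner `wrap!(a())` and `wrap!(a()).b()` would each be a chain prefix sitting in its own expression position, which is the case the comment argues should keep working.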
| - A complete chain invocation behaves exactly as if it were a single call to a function taking | ||
| the concatenation of every section's parameters (plus the receiver, if any) in declared order, | ||
| and running the final section's body. There are accordingly no auto-trait, `Copy`/`Clone`, or | ||
| drop-timing questions to answer for "the result of a non-final section": no such value exists | ||
| for any such property to apply to. The only types that matter to a method chain are each | ||
| section's own parameter types and the chain's final return type, exactly as for an ordinary | ||
| function. |
How do you refer to the type of such a function? Are extern "C" method chains possible?
(Note: later points make it clear that this is not possible, but I think it's worth pointing out since even if you don't expose this to the user, the compiler will need to come up with some sort of type for it anyway.)
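As a sketch of the "single function" reading described in the RFC excerpt, assume the chain were lowered to one function whose parameters are the concatenation of every section's parameters. The name `define_movie_released_in_directed_by`, the owned fields on `Movie`, and the helper `call_through_pointer` are all hypothetical; the RFC does not specify any of them.

```rust
// Assumption: `Movie` owns its data, matching the plain `-> Movie` return type.
#[derive(Debug, PartialEq)]
struct Movie {
    name: String,
    release_year: usize,
    director_name: String,
}

// Hypothetical lowered form of
// `define_movie(name).released_in(release_year).directed_by(director_name)`.
fn define_movie_released_in_directed_by(
    name: &str,
    release_year: usize,
    director_name: &str,
) -> Movie {
    Movie {
        name: name.to_owned(),
        release_year,
        director_name: director_name.to_owned(),
    }
}

// The lowered function has an ordinary function-pointer type, which is
// the kind of type the compiler would need internally in this strategy.
fn call_through_pointer() -> Movie {
    let f: fn(&str, usize, &str) -> Movie = define_movie_released_in_directed_by;
    f("A Title", 2001, "Yorgos Lanthimos")
}
```

Under this reading the compiler-internal type is easy to state. What the RFC leaves open is whether any surface syntax would ever let user code name it.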
| This RFC defines only this observable behavior, and deliberately imposes no implementation | ||
| strategy beyond it. A compiler may lower a method chain to a single physical function (the most | ||
| direct strategy, since nothing about its semantics requires any intermediate value to exist), to a | ||
| sequence of single-method generic types the way today's `assemblist` macro does, or to anything | ||
| else that reproduces the same observable behavior. Whichever strategy is used, the language | ||
| defines no name for any such intermediate construct: it is never shown to `rustdoc`, never | ||
| participates in name resolution, and is not something a Rust programmer is ever expected to | ||
| reason about. |
I don't really like this framing because, well, first, we do show this to rustdoc. rustdoc has full access to the compiler internals and therefore will have visibility of whatever compiler representation we choose for this, and it needs to be able to convert that into usable documentation. Therefore, access to the original is required for rustdoc to function in some manner.
| equivalent to calling a single method over `index`, `kind`, and `title` together with the | ||
| `&mut self` receiver. | ||
|
|
||
| ## Drawbacks |
I mention a pretty clear drawback when responding to an earlier point in the RFC: #3977 (comment)
| Its drawback is exactly what motivated moving away from it: it forces a compiler to actually | ||
| generate, and a chain's intermediate calls to actually produce, a real value of a real type for | ||
| every non-final section, even when, as in every example in this RFC, the whole chain is written | ||
| out in one continuous expression and nothing was ever going to inspect, store, or outlive that | ||
| intermediate value. Committing to that as part of the language's observable behavior also commits | ||
| to answering, for every method chain, questions this RFC currently avoids needing to answer at | ||
| all: what auto traits that intermediate value has, when it is dropped, whether it is `Send`, and | ||
| so on, all for a value whose only legitimate use is to immediately call the next section. |
To be clear, the compiler is probably going to have to do this anyway just by the nature of how it works today. When parsing the expression, it's going to be identical to a regular method chain before name resolution happens. In order to properly resolve the names, it's going to have to assign a type for the initial call, and keep a status of the various methods chained onto it until it finishes, at which point, it will have a full type for the expression.
It would be ridiculous IMHO to fundamentally change how the compiler works just to accept this one feature. Instead, I would expect this feature to bend to how method chains normally work internally.
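For illustration, this is roughly what a lowering with real intermediate types looks like when written by hand in today's Rust. The type names `DefineMovie` and `ReleasedIn` and the helper `complete_later` are hypothetical, and this is not a claim about what any existing macro generates.

```rust
#[derive(Debug, PartialEq)]
struct Movie {
    name: String,
    release_year: usize,
    director_name: String,
}

// Hypothetical intermediate type produced by the first section.
struct DefineMovie<'a> {
    name: &'a str,
}

// Hypothetical intermediate type produced by the second section.
struct ReleasedIn<'a> {
    name: &'a str,
    release_year: usize,
}

fn define_movie<'a>(name: &'a str) -> DefineMovie<'a> {
    DefineMovie { name }
}

impl<'a> DefineMovie<'a> {
    // The only method on this type is the next section of the chain.
    fn released_in(self, release_year: usize) -> ReleasedIn<'a> {
        ReleasedIn { name: self.name, release_year }
    }
}

impl<'a> ReleasedIn<'a> {
    // The final section consumes the accumulated state and builds the result.
    fn directed_by(self, director_name: &'a str) -> Movie {
        Movie {
            name: self.name.to_owned(),
            release_year: self.release_year,
            director_name: director_name.to_owned(),
        }
    }
}

// With real intermediate types, a chain prefix is an ordinary value
// that can be stored and completed later.
fn complete_later() -> Movie {
    let partial = define_movie("A Title").released_in(2001);
    partial.directed_by("Yorgos Lanthimos")
}
```

Here `partial` is a nameable value with drop timing and auto traits of its own. That is what the RFC text wants to rule out, and what the comment expects type checking to need in some form regardless.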
| substantially harder to design and to reason about if sections could additionally be taken in any | ||
| order. | ||
|
|
||
| ### Doing nothing |
I think one obvious thing missing from this section is how all of these individual points could be improved on their own, rather than simply offering this solution as the solution to all of them.
For example, you're right that the builder pattern makes things difficult because everything is spread on separate pages. Could we have some attributes for rustdoc to improve this? Maybe rustdoc could even generate some state diagrams for us?
Introduce method chains: a native item whose parameter list is split across named, dot-separated sections, as an additive alternative to named/optional arguments.