PingCAP Style Guide
Guide to writing idiomatic code at PingCAP and in the TiKV project.
Project maintained by pingcap
Hosted on GitHub Pages — Theme by mattgraham
Functions, methods, and traits
- Conversions should use conversion traits where possible, rather than ad hoc methods.
E.g., implement
From<T>
or TryFrom<T>
, rather than have an inherent from_T
or into_T
method.
- Where there is a conversion from
T
to U
and where T
has more information than U
, put conversion methods on T
and not U
.
Where there is a (lossy) conversion from U
to T
, put that on T
too (presumably such a conversion might fail, or will need extra information, so you might not be able to use a standard trait).
- Rationale: prevents having many conversions on the ‘lowest information’ types.
Prevents code duplication.
See also the API guide.
- Where there is a clear receiver (
self
), prefer a method to a function.
- Rationale: methods are usually easier to use than functions, see the API guide
There is often a choice between using a type’s concrete name and using Self
, e.g., in function signatures or when qualifying names.
In general, use Self
when the meaning is ‘the current type’, and the explicit type name when the meaning is ‘that specific type’.
Some examples:
- Constructors should usually use the explicit type name.
- Calls to private static methods should usually use
Self
.
- Methods which expect an object of the same type, e.g., comparisons, should use
Self
.
If in doubt, use the explicit type name.
Functions
- Prefer short functions and methods.
- The shorter the better, but as an upper limit, you should nearly always be able to read a whole function without scrolling on a regular monitor.
- Split a large function into several smaller ones.
- If that is difficult, it is often possible to factor out a data structure and turn the large function into small methods.
- Don’t use
#[inline]
(or never
or always
) unless benchmarking shows an improvement.
- When creating an object, a constructor with no arguments should be an implementation of
Default
.
The most common constructor should be an inherent, static method called new
.
If there are one or two other constructors with clear roles, they should be inherent, static methods with a with_
prefix.
If there is more complexity, prefer to use the builder pattern.
- Functions which take ownership of an argument, should take that argument by value or as a
Box
.
They should not take a reference and clone.
- Rationale: avoid unnecessary cloning.
- Functions should prefer to take a borrowed reference, unless they need to take ownership of some form.
- Functions should prefer to return an object by value, rather than using a
Box
, Rc
, or other smart pointer.
- Rationale: increases flexibility because the caller can decide whether to keep the object by value or pointer, without unnecessary allocation.
- Consider if arguments should be generic - using generics and trait bounds rather than concrete types can make functions more flexible and testable, but can also make them more complicated and harder to read.
Do not use generic types unless there is a justification.
Avoid using generic types to make converting types more implicit.
E.g.,
fn foo(x: impl Into<bool>)
allows foo
to be called with either foo(None)
or foo(true)
which saves the caller writing Some(true)
or true.into()
.
However, the into
is still required, it is just inside the function, and the ergonomic improvement of true
over Some(true)
is minimal, perhaps even negative.
Prefer to convert types at the place in the API where it makes logical sense, and to embrace the use of Option
and similar types.
- Arguments should prefer to take a custom enum which conveys meaning, rather than using a
bool
or Option
.
- Rationale: makes code more readable, future proof.
- See also the API guide.
- Prefer to use the type system to validate function arguments
- If not possible, validate arguments dynamically.
- See also the API guide.
- If using
Read
or Write
bounds, take the argument by value rather than (mutable) reference.
Methods
- Do not implement getters and setters unless necessary; prefer public fields.
- If a function has no clear receiver and primarily creates a data type, it should usually be a static method of that data type.
I.e., All constructor functions should be static methods.
- Private static methods should usually be free functions (modules, not types, are the primary privacy boundaries in Rust).
- Public functions should be static methods if and only if they are logically part of the type’s API, e.g., constructor functions.
Generics
- Consider the trade-off between generics (or
impl Tr
) and object types (dyn Tr
).
Generics are usually, but not always, better (see this blog post for details).
- For generics with a single, simple bound, prefer using
impl
syntax.
- For generics with complex bounds, prefer using a
where
clause.
- Use lifetime elision wherever possible (using 2018 rules).
- Function types should always be in
where
clauses.
- Use the most flexible function type you can.
I.e., prefer to use
Fn
over FnMut
over FnOnce
for argument types.
- Aside, if you ever need to implement a function trait, prefer to implement
FnOnce
over FnMut
over Fn
.
- More than about three generic parameters are an indication that the function might be better factored as multiple functions or a data type, or that some generic parameters might be better as associated types.
Flexible function arguments
For each of these flavours of flexible function arguments, there is a trade-off.
The generic form is more flexible.
However, the generic form makes the function signature more difficult to read, can cause code bloat, adds boilerplate to the signature and body, and in some rare cases generates less performant code.
Some things to consider:
- Is the function public API?
If so, prefer the more flexible version, otherwise prefer the simple version.
- Are there places in existing code where the function is called with conversion at the callsite?
If so, prefer the flexible version if it reduces code duplication.
- Is the simple type commonly found in different flavours?
E.g.,
str
has OsStr
and CStr
variations, and the owned version (String
) is common; if you intend to iterate a collection, it is common to pass both the collection and an iterator.
In these cases, prefer the flexible version.
- Otherwise, prefer the simple version (rationale: KISS, YAGNI).
Some examples of this theme:
impl AsRef<T>
is a more flexible alternative to &T
.
impl Into<T>
is a more flexible alternative to T
.
- a special case of this is
IntoIterator
.
Cow
lets a function take either an owned or borrowed value.
- A special case is
Cow<'static, T>
, which takes either an owned or static value, it is commonly used with str
and/or in places which commonly take a const
or static
value, but rarely take a custom value (perhaps for testing).
Using Cow
here prevents an unnecessary allocation in the common case.
Traits
- Use traits for most aspects of a type’s behaviour; reserve inherent methods for actions on the concrete data.
- Prefer small traits which model a single aspect of behaviour.
- Think of traits as aspects of behaviour, rather than collections of functionality.
- Use the extension trait pattern to offer core and extended behaviour.
- Useful for extending traits in a downstream crate.
- Useful for having a core trait which is object-safe and a larger trait with extended behaviour.
- An extension trait to
Foo
should be called FooExt
if it is always valid.
- See RFC 445 for more details.
- Group all required methods together, then group all provided methods together.
Associated types
Generic type parameters and associated types are not the same.
Generic type parameters are chosen by the user of a trait.
Associated types are chosen by the implementer of a trait.
For more details see the book.
It is often useful to supply a default type for associated types.
If you add an associated type to a trait, using a default type makes the change backwards compatible.
<< Implementing traits | Macros >>