Want Speed, Pass by Value


  • 7/25/2019 Want Speed, Pass by Value


    Want Speed? Pass by Value. C++Next

    Want Speed? Pass by Value.

    This entry is part of a series, RValue References: Moving Forward

    Be honest: how does the following code make you feel?

    std::vector<std::string> get_names();

    std::vector<std::string> const names = get_names();

    Frankly, even though I should know better, it makes me nervous. In principle, when get_names() returns, we have to copy a vector of strings. Then, we need to copy it again when we initialize names, and we need to destroy the first copy. If there are N strings in the vector, each copy could require as many as N+1 memory allocations and a whole slew of cache-unfriendly data accesses as the string contents are copied.

    Rather than confront that sort of anxiety, I've often fallen back on pass-by-reference to avoid needless copies:

    void get_names(std::vector<std::string>& out_param);

    std::vector<std::string> names;

    get_names( names );

    Unfortunately, this approach is far from ideal.

    The code grew by 150%

    We've had to drop const-ness because we're mutating names.

    As functional programmers like to remind us, mutation makes code more complex to reason about by undermining referential transparency and equational reasoning.

    We no longer have strict value semantics for names. [1]

    But is it really necessary to mess up our code in this way to gain efficiency? Fortunately, the answer turns out to be no (and especially not if you are using C++0x). This article is the first in a series that explores rvalues and their implications for efficient value semantics in C++.

    RValues

    Rvalues are expressions that create anonymous temporary objects. The name rvalue refers to the fact that an rvalue expression of builtin type can only appear on the right-hand side of an assignment. Unlike lvalues, which, when non-const, can always be used on the left-hand side of an assignment, rvalue expressions yield objects without any persistent identity to assign into. [2]
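The distinction can be checked mechanically. The sketch below assumes a C++11 compiler (decltype and static_assert are newer than the compilers this article was written against); counter and next_value are hypothetical names used only for illustration. It relies on the rule that decltype((e)) is a reference type T& when e is an lvalue and a plain T when e is a temporary-producing rvalue.

```cpp
#include <cassert>
#include <type_traits>

int counter = 0;                          // a named object: `counter` is an lvalue
int next_value() { return counter + 1; }  // returns by value: a call is an rvalue

// decltype((e)) is T& for an lvalue e, and plain T for a by-value result.
static_assert(std::is_lvalue_reference<decltype((counter))>::value,
              "a named variable is an lvalue");
static_assert(!std::is_reference<decltype((next_value()))>::value,
              "a call to a function returning by value is an rvalue");
static_assert(!std::is_reference<decltype((counter + 1))>::value,
              "the result of int + int is an rvalue");

// Assigns an rvalue to an lvalue and returns the new value of counter.
int assign_demo()
{
    int const before = counter;
    counter = next_value();     // fine: lvalue on the left, rvalue on the right
    // next_value() = counter;  // would not compile: no persistent object to assign into
    assert(counter == before + 1);
    return counter;
}
```

The commented-out line is the interesting one: the compiler rejects it because the result of next_value() has no identity that an assignment could target.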

    The important thing about anonymous temporaries for our purposes, though, is that they can only be used once in an expression. How could you possibly refer to such an object a second time? It doesn't have a name (thus, anonymous) and after the full expression is evaluated, the object is destroyed (thus, temporary)!

    Once you know you are copying from an rvalue, then, it should be possible to steal the expensive-to-copy resources from the source object and use them in the target object without anyone noticing. In this case that would mean transferring ownership of the source vector's dynamically-allocated array of strings to the target vector. If we could somehow get the compiler to execute that move operation for us, it would be cheap, almost free, to initialize names from a vector returned by-value.
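The transfer of ownership described here can be observed directly. This is only a sketch, and it assumes C++11: std::move and the vector move constructor are the language-level form of the idea, which this installment has not introduced yet. make_names is a hypothetical stand-in for get_names(), and the strings are placeholder data.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for get_names(), filled with placeholder data.
std::vector<std::string> make_names()
{
    std::vector<std::string> v;
    v.push_back("alice");
    v.push_back("bob");
    v.push_back("carol");
    return v;
}

// Returns true if `target` ends up owning the very array `source` allocated.
bool move_steals_buffer()
{
    std::vector<std::string> source = make_names();
    const std::string* const buffer = source.data();  // address of source's heap array

    // std::move marks `source` as something we promise not to read again,
    // so `target` may take over its array instead of copying every string.
    std::vector<std::string> target(std::move(source));

    assert(target.size() == 3);
    return target.data() == buffer;  // same array: ownership was transferred
}
```

No string is copied and no new array is allocated for target; only the vector's internal pointers change hands.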

    That would take care of the second expensive copy, but what about the first? When get_names returns, in principle, it has to copy the function's return value from the inside of the function to the outside. Well, it turns out that return values have the same property as anonymous temporaries: they are about to be destroyed, and won't be used again. So, we could eliminate the first expensive copy in the same way, transferring the resources from the return value on the inside of the function to the anonymous temporary seen by the caller.

    Copy Elision and the RVO

    The reason I kept writing above that copies were made in principle is that the compiler is actually allowed to perform some optimizations based on the same principles we've just discussed. This class of optimizations is known formally as copy elision. For example, in the Return Value Optimization (RVO), the calling function allocates space for the return value on its stack, and passes the address of that memory to the callee. The callee can then construct a return value directly into that space, which eliminates the need to copy from inside to outside. The copy is simply "elided," or "edited out," by the compiler. So in code like the following, no copies are required:

    std::vector<std::string> names = get_names();
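One way to check whether a given compiler performs the RVO is to count copy-constructor calls. The following is a minimal sketch: Tracer and make_tracer are hypothetical names, not part of the article's code. On compilers that elide (and for this particular pattern the elision became mandatory in C++17), the count is expected to be zero.

```cpp
#include <cassert>

// Hypothetical instrumented type: counts how often it is copy-constructed.
struct Tracer
{
    static int copies;
    Tracer() {}
    Tracer(Tracer const&) { ++copies; }
};
int Tracer::copies = 0;

// Returns a temporary by value; with the RVO the callee builds it
// directly in the space the caller set aside.
Tracer make_tracer() { return Tracer(); }

// Number of copy-constructor calls used to initialize a local from the call.
int copies_for_return_by_value()
{
    Tracer::copies = 0;
    Tracer t = make_tracer();     // both "in principle" copies can be elided
    (void)t;
    assert(Tracer::copies <= 2);  // never more than the two copies counted in principle
    return Tracer::copies;
}
```

With elision disabled on an older language standard, the same function would report the two copies the article counts "in principle".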

    Also, although the compiler is normally required to make a copy when a function parameter is passed by value (so modifications to the parameter inside the function can't affect the caller), it is allowed to elide the copy, and simply use the source object itself, when the source is an rvalue.

    std::vector<std::string>
    sorted(std::vector<std::string> names)
    {
        std::sort(names.begin(), names.end());
        return names;
    }

    // names is an lvalue; a copy is required so we don't modify names
    std::vector<std::string> sorted_names1 = sorted( names );

    // get_names() is an rvalue expression; we can omit the copy!
    std::vector<std::string> sorted_names2 = sorted( get_names() );

    This is pretty remarkable. In principle, in line 12 above (the initialization of sorted_names2), the compiler can eliminate all the worrisome copies, making sorted_names2 the same object as the one created in get_names(). In practice, though, the principle won't take us quite that far, as I'll explain later.
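The difference between the two calls can be made visible by counting element copies. This sketch makes several assumptions: Name, make_names and by_value are hypothetical, the sort is left out so that only argument passing is measured, and a C++11 compiler is assumed (so that returning the parameter moves the vector rather than copying it).

```cpp
#include <cassert>
#include <vector>

// Hypothetical element type that counts copy constructions.
struct Name
{
    static int copies;
    Name() {}
    Name(Name const&) { ++copies; }
};
int Name::copies = 0;

// Stand-in for get_names(): three default-constructed elements, no copies.
std::vector<Name> make_names() { return std::vector<Name>(3); }

// Same shape as sorted(): takes its argument by value and returns it.
std::vector<Name> by_value(std::vector<Name> names) { return names; }

int copies_for_lvalue_argument()
{
    std::vector<Name> names = make_names();
    Name::copies = 0;
    std::vector<Name> result = by_value(names);  // lvalue: every element is copied
    assert(result.size() == 3 && names.size() == 3);
    return Name::copies;
}

int copies_for_rvalue_argument()
{
    Name::copies = 0;
    std::vector<Name> result = by_value(make_names());  // rvalue: nothing to copy
    assert(result.size() == 3);
    return Name::copies;
}
```

The lvalue call copies all three elements once, because the caller's names must stay intact; the rvalue call copies none.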

    Implications

    Although copy elision is never required by the standard, recent versions of every compiler I've tested do perform these optimizations today. But even if you don't feel comfortable returning heavyweight objects by value, copy elision should still change the way you write code.

    Consider this cousin of our original sorted() function, which takes names by const reference and makes an

    explicit copy:

    std::vector<std::string>
    sorted2(std::vector<std::string> const& names) // names passed by reference
    {
        std::vector<std::string> r(names);         // and explicitly copied
        std::sort(r.begin(), r.end());
        return r;
    }

    Although sorted and sorted2 seem at first to be identical, there could be a huge performance difference if a compiler does copy elision. Even if the actual argument to sorted2 is an rvalue, the source of the copy, names, is an lvalue [3], so the copy can't be optimized away. In a sense, copy elision is a victim of the separate compilation model: inside the body of sorted2, there's no information about whether the actual argument to the function is an rvalue; outside, at the call site, there's no indication that a copy of the argument will eventually be made.

    That realization leads us directly to this guideline:

    Guideline: Don't copy your function arguments. Instead, pass them by value and let the compiler do the copying.

    At worst, if your compiler doesn't elide copies, performance will be no worse. At best, you'll see an enormous performance boost.
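To see the guideline pay off, the two styles can be compared on an rvalue argument. This is a sketch under stated assumptions: Roster, make_roster, copy_inside and copy_at_call are hypothetical, and the move constructor assumes C++11. copy_inside follows the shape of sorted2, and copy_at_call follows the shape of sorted.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical heavyweight type that counts its own copy constructions.
class Roster
{
public:
    static int copies;
    Roster() : names_(3, "name") {}
    Roster(Roster const& other) : names_(other.names_) { ++copies; }
    Roster(Roster&& other) : names_(std::move(other.names_)) {}  // assumes C++11
    std::size_t size() const { return names_.size(); }
private:
    std::vector<std::string> names_;
};
int Roster::copies = 0;

Roster make_roster() { return Roster(); }

// Style of sorted2: take a reference, then copy explicitly inside.
Roster copy_inside(Roster const& r) { Roster local(r); return local; }

// Style of sorted: take the argument by value; any copy happens at the call site.
Roster copy_at_call(Roster r) { return r; }

int copies_when_copying_inside()
{
    Roster::copies = 0;
    Roster out = copy_inside(make_roster());   // rvalue argument, copied anyway
    assert(out.size() == 3);
    return Roster::copies;
}

int copies_when_passing_by_value()
{
    Roster::copies = 0;
    Roster out = copy_at_call(make_roster());  // rvalue argument, nothing copied
    assert(out.size() == 3);
    return Roster::copies;
}
```

Inside copy_inside the parameter r is a named reference, so the explicit copy always runs; copy_at_call leaves the decision to the call site, where the compiler can see that the argument is a temporary.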

    One place you can apply this guideline immediately is in assignment operators. The canonical, easy-to-write,

    always-correct, strong-guarantee, copy-and-swap assignment operator is often seen written this way:

    T& T::operator=(T const& x) // x is a reference to the source
    {
        T tmp(x);          // copy construction of tmp does the hard work
        swap(*this, tmp);  // trade our resources for tmp's
        return *this;      // our (old) resources get destroyed with tmp
    }

    but in light of copy elision, that formulation is glaringly inefficient! It's now obvious that the correct way to write a copy-and-swap assignment is:

    T& operator=(T x) // x is a copy of the source; hard work already done

    {

    swap(*this, x); // trade our resources for x's

    return *this; // our (old) resources get destroyed with x

    }
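As a complete illustration of the by-value form, here is a sketch of a small class that uses it. NameList and its members are hypothetical; the move constructor assumes C++11 and is there only so that the class stays cheap to return and pass around.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical resource-owning class using by-value copy-and-swap assignment.
class NameList
{
public:
    static int copies;

    NameList() {}
    explicit NameList(std::size_t n) : names_(n, "name") {}
    NameList(NameList const& other) : names_(other.names_) { ++copies; }
    NameList(NameList&& other) : names_(std::move(other.names_)) {}  // assumes C++11

    friend void swap(NameList& a, NameList& b) { a.names_.swap(b.names_); }

    NameList& operator=(NameList x)  // x is already a copy, or the temporary itself
    {
        swap(*this, x);              // trade our resources for x's
        return *this;                // our (old) resources get destroyed with x
    }

    std::size_t size() const { return names_.size(); }

private:
    std::vector<std::string> names_;
};
int NameList::copies = 0;

int copies_assigning_from_lvalue()
{
    NameList source(3), target;
    NameList::copies = 0;
    target = source;                 // lvalue: one copy, made to initialize x
    assert(target.size() == 3 && source.size() == 3);
    return NameList::copies;
}

int copies_assigning_from_rvalue()
{
    NameList target;
    NameList::copies = 0;
    target = NameList(3);            // rvalue: no copy is needed to initialize x
    assert(target.size() == 3);
    return NameList::copies;
}
```

Assigning from an lvalue still costs exactly one copy, the same as the const-reference version, while assigning from a temporary costs none.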

    Reality Bites

    Of course, lunch is never really free, so I have a couple of caveats.

    First, when you pass parameters by reference and copy in the function body, the copy constructor is called from one central location. However, when you pass parameters by value, the compiler generates calls to the copy constructor at the site of each call where lvalue arguments are passed. If the function will be called from many places and code size or locality is a serious consideration for your application, it could have a real effect.

    On the other hand, it's easy to build a wrapper function that localizes the copy:

    std::vector<std::string>
    sorted3(std::vector<std::string> const& names)
    {
        // copy is generated once, at the site of this call
        return sorted(names);
    }

    Since the converse doesn't hold (you can't get back a lost opportunity for copy elision by wrapping), I recommend you start by following the guideline, and make changes only as you find them to be necessary.

    Second, I've yet to find a compiler that will elide the copy when a function parameter is returned, as in our implementation of sorted. When you think about how these elisions are done, it makes sense: without some form of inter-procedural optimization, the caller of sorted can't know that the argument (and not some other object) will eventually be returned, so the compiler must allocate separate space on the stack for the argument and the return value.

    If you need to return a function parameter, you can still get near-optimal performance by swapping into a

    default-constructed return value (provided default construction and swap are cheap, as they should be):

    std::vector<std::string>
    sorted(std::vector<std::string> names)
    {
        std::sort(names.begin(), names.end());
        std::vector<std::string> ret;
        swap(ret, names);
        return ret;
    }
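That the swap really hands the argument's storage to the return value can be checked by comparing array addresses. This is a sketch with assumptions: sorted_by_swap repeats the function above under a hypothetical name, the strings are placeholder data, and std::move (C++11) is used only to turn a named vector into an rvalue whose array address is known beforehand.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Same technique as the swap-based sorted above, under a hypothetical name.
std::vector<std::string> sorted_by_swap(std::vector<std::string> names)
{
    std::sort(names.begin(), names.end());
    std::vector<std::string> ret;  // cheap: an empty vector owns no array
    swap(ret, names);              // ret takes over the sorted array
    return ret;                    // ret is a local, so this copy can be elided
}

// Returns true if the result owns the same array the argument started with.
bool result_reuses_argument_buffer()
{
    std::vector<std::string> input;
    input.push_back("carol");
    input.push_back("alice");
    input.push_back("bob");
    const std::string* const buffer = input.data();

    // std::move (an assumption: C++11) lets the parameter take input's array.
    std::vector<std::string> output = sorted_by_swap(std::move(input));

    assert(output.size() == 3 && output.front() == "alice");
    return output.data() == buffer;
}
```

The array allocated for input travels through the parameter, the swap, and the return without a single string being copied.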

    More To Come

    Hopefully you now have the ammunition you need to stave off anxiety about passing and returning nontrivial objects by value. But we're not done yet: now that we've covered rvalues, copy elision, and the RVO, we have all the background we need to attack move semantics, rvalue references, perfect forwarding, and more as we continue this article series. See you soon!

    Follow this link to the next installment: http://cpp-next.com/archive/2009/09/move-it-with-rvalue-references/

    Acknowledgements

    Howard Hinnant is responsible for key insights that make this article series possible. Andrei Alexandrescu was

    posting on comp.lang.c++.moderated about how to leverage copy elision years before I took it seriously. Most of

    all, though, thanks in general to all readers and reviewers!

    1. Googling for a good definition of value semantics turned up nothing for me. Unless someone else can point to one (and maybe even if they can), we'll be running an article on that topic, in which I promise you a definition, soon.

    2. For a detailed treatment of rvalues and lvalues, please see this excellent article by Dan Saks: http://www.embedded.com/story/OEG20010518S0071

    3. Except for enums and non-type template parameters, every value with a name is an lvalue.

    Posted Saturday, August 15th, 2009 under Value Semantics.
