I'm trying to implement the simple threadpool with the as low overhead as possible (preferably at C++14).
The general idea is to "pack" tasks to lambda without arguments (but with capture list), and provide a single public function to add tasks to threadpool.
Let's consider the following simple (and ugly) example, illustrating the problem (it's not the threadpool itself, but the simplified code snippet with helper class VerboseStr
).
#include <iostream>
#include <string>
#include <functional>
#include <thread>
// Small helper macros
#define T_OUT(str) std::cout << Ident() \
<< "[" << std::to_string(id_) << "] " \
<< str << std::endl; \
#define T_OUT_FROMTO(str) std::cout << Ident() \
<< "[" << std::to_string(i.id_) << "->" << std::to_string(id_) << "] " \
<< str << std::endl; \
#define T_SLEEP(ms) std::this_thread::sleep_for(std::chrono::milliseconds(ms));
// Just a verbose class, holding std::string inside
/////////////////////////////////////////////////////////
class VerboseStr
{
std::string val_;
int id_;
static int GetId() { static int id = 0; return ++id; }
// Several spaces to ident lines
std::string Ident() const { return std::string(id_, ' '); }
public:
VerboseStr() : id_(GetId())
{
T_OUT("Default constructor called");
};
~VerboseStr()
{
val_ = "~Destroyed!";
T_OUT("Destructor called");
}
VerboseStr(const std::string& i) : val_(i), id_(GetId())
{
T_OUT("Create constructor called");
};
VerboseStr(const VerboseStr& i) : val_(i.val_), id_(GetId())
{
val_ = i.val_;
T_OUT_FROMTO("Copy constructor called");
};
VerboseStr(VerboseStr&& i) noexcept : val_(std::move(i.val_)), id_(GetId())
{
T_OUT_FROMTO("Move constructor called");
};
VerboseStr& operator=(const VerboseStr& i)
{
val_ = i.val_;
T_OUT_FROMTO("Copy operator= called");
return *this;
}
VerboseStr& operator=(VerboseStr&& i) noexcept
{
val_ = std::move(i.val_);
T_OUT_FROMTO("Move operator= called");
return *this;
}
const std::string ToStr() const { return std::string("[") + std::to_string(id_) + "] " + val_; }
void SetStr(const std::string& val) { val_ = val; }
};
/////////////////////////////////////////////////////////
// Capturing args by VALUES in lambda
template<typename Fn, typename... Args>
void RunAsync_V(Fn&& func, Args&&... args)
{
auto t = std::thread([func_ = std::forward<Fn>(func), args...]()
{
T_SLEEP(1000); // "Guarantees" async execution
func_(args...);
});
t.detach();
}
void DealWithVal(VerboseStr str)
{
std::cout << "Str copy: " << str.ToStr() << std::endl;
}
void DealWithRef(VerboseStr& str)
{
std::cout << "Str before change: " << str.ToStr() << std::endl;
str.SetStr("Changed");
std::cout << "Str after change: " << str.ToStr() << std::endl;
}
// It's "OK", but leads to 2 calls of copy constructor
// Replacing 'str' with 'std::move(str)' leads to no changes
void Test1()
{
VerboseStr str("First example");
RunAsync_V(&DealWithVal, str);
}
// It's OK
void Test2()
{
VerboseStr str("Second example");
RunAsync_V(&DealWithRef, std::ref(str));
// Waiting for thread to complete...
T_SLEEP(1500);
// Output the changed value of str
std::cout << "Checking str finally: " << str.ToStr() << std::endl;
}
int main()
{
Test1();
// Test2();
T_SLEEP(3000); // Give a time to finish
}
As said in comment above, the problem is in Test1()
function.
It is obvious that in the context of Test1()
, the only possible way to asynchronously call a function DealWithVal
is to "move" str
to a lambda body.
When Test1()
called from main()
, the output is following:
[1] Create constructor called
[1->2] Copy constructor called
[2->3] Move constructor called
[2] Destructor called
[1] Destructor called
[3->4] Copy constructor called
Str copy: [4] First example
[4] Destructor called
[3] Destructor called
As we can see, there are 2 calls of copy constructor.
I cannot realize how to implement this, considering that the passing by value (without moving) and by reference (take a look at Test2()
) should be available as well.
Please help sort out the problem. Thanks in advance.
Your two copies are coming from:
auto t = std::thread([func_ = std::forward<Fn>(func), args...]()
// ^^^^^^^ here
{
T_SLEEP(1000); // "Guarantees" async execution
func_(args...);
// ^^^^^^^ and here
You copy into the lambda, which is good, and which is all you can do since you were passed an lvalue reference. Then you copy the args from the lambda into the by-value function. But by design, you own the args now and the lambda has no further purpose, so you should move them from the lambda. i.e.:
auto t = std::thread([func_ = std::forward<Fn>(func), args...]() mutable
// ^^^^^^^
{
T_SLEEP(1000); // "Guarantees" async execution
func_(std::move(args)...);
// ^^^^^^^^^^^^^^^
This reduces you to the one necessary copy.
The other answer implicitly wraps passed lvalues in reference_wrappers to get zero copies. The caller is required to maintain the value for the async lifetime, but there is no explicit record of that at the call-site. It runs counter to expectations from analogous functionality (e.g. std::thread applies a decay_copy and requires the caller to wrap in a reference wrapper if that's what they want).
Gattet, thank you for parcipiation! Your version gave me new thoughts. Please, take look at the following - link. At this time, I'm trying to simplify solution - the RunAsync_V2, which accepts parameters by VALUE itself (which, in particular, forces caller to use std::ref when appropriate, but it's OK for my case). But I cannot realize, how to "move" parameter pack in lambda capture in C++14. The description is in comments in
Test1()
Sorry, I typed your name incorrectly in the previous comment :( PS. If "moving" parameter pack is not possible in C++14 at all - I'll prefer to provide several versions of
RunAcync
in my threadpool, allowing to pass 1,2,3... parameters. But of course, it's always better to have more general solution.Ideally, what you want to write would have
RunAsync
take forwarding referencesArgs&&... args
, so as to keep references as long as possible, and you would initialize the lambda captures by-value with forwarding[func_ = std::forward<Fn>(func), ...args_ = std::forward<Args>(args)]() mutable
resulting in a move into the lambda when possible, and copy if not. Then in the lambda you'd move into the callstd::move(func_)(std::move(args_)...)
(or even betterstd::invoke
). You can write this, but in C++20. :(The variadic lambda initializer is what is not available in C++14. You can however write basically the same thing by smuggling through a tuple:
[func_ = std::forward<Fn>(func), args_ = std::tuple<Args...>(std::forward<Args>(args)...)]() mutable
. This is very close to the updated answer from Jarod42 (which has Run taking by value, and moving into the lambda, resulting in an additional move relative to taking by reference and forwarding into the lambda -- a minor difference).Then of course, inside the lambda, you need something like
std::apply
which is in C++17:apply(std::move(func_), std::move(args_))
(I don't think I'd bother trying to emulatestd::invoke
before C++17, but you could depending on your needs). I'm not sure why Jarod42 wraps the call insideapply
with a lambda. For implementingapply
, you can copy the example implementation from cppreference,for example (replacestd::invoke
with a call expression). Happy coding!