分享｜C++11 新特性 | 从内存模型角度分析 std::tuple，揭秘异质容器

2909

2021.03.14

2021.04.06

发布于未知归属地

本期聊一个C++11引入的类std::tuple，为更好地理解本期，建议先认真阅读上一期【编译器优化之 Empty Base Class Optimization】。

C++中，std::vector、std::list、std::map/set等容器都只能存储同一种类型，属于同质容器。而类std::tuple，它弥补了 std::pair只能存储两个对象的缺陷，可以和 class / struct一样存储不同类型对象。

本文更好的阅读体验，可以点击：走近 std::tuple，揭秘异质容器

更多硬核知识，vx搜一搜: look_code_art，更多硬核等你发现，

也可以添加个人 vx: fibonaccii_

struct / class中变量可以通过变量名来使用，而std::tuple对象obj中存储的元素，是通过std::get<Index>(obj)函数来获取obj对象的第Index个元素，即通过索引来获取。

int main(int argc, char const *argv[]) {
    
    std:tuple<int, std::string, double> t = std::make_tuple(1, "Foo", 3.14);
    // 基于下标索引获取
    std::cout << "(" << std::get<0>(t) << ", " << std::get<1>(t) << ", " << std::get<2>(t) << ")\n";
  return 0;
}

如果问你std::vector、std::list、std::map/set 等容器的底层实现，你大概地知道std::vector底层是数组，std::list是链表，std::map是红黑树。但是，当问你 std::tuple 的底层实现，你可能一时半会儿想不出它底层是基于何种数据结构，或者什么原理实现。

实际上，std::tuple是基于递归实现，在递归中完成std::tuple对象的构造。比如，在上面demo中，std::tuple对象t的构造过程递如下：

第一层递归。准备存储t的第一个元素1，剩下的是 std:tuple<std::string, double> 对象
第二层递归。准备存储t的第二个元素Foo，剩下的是 std::tuple<double> 对象
第三层递归。准备存储t第三个元素3.14。由于3.14是最后一个元素，递归结束。此时t的才开始存储元素，并且存储元素是从3.14开始，然后递归开始返回。
返回到第2步的递归，存储std::string类型的Foo
返回到第1步的递归，存储第int类型的1。

也就是说，std::tuple 的构造函数中，最后一个传入的元素最先构造，最先传入的元素最后一个构造，符合递归顺序，即入栈顺序。下面从源码角度，具体来看看 std::tuple 的实现。

std::_Tuple_impl

std::tuple 是继承于 std::_Tuple_impl，std::tuple 的主体核心功能都是由 std::_Tuple_impl完成。因此下面着重于分析 std::_Tuple_impl 的构造过程。

 template <typename... _Elements>
 class tuple : public _Tuple_impl<0, _Elements...> { 
  	//...
 };

那么，我们先来看看 std::_Tuple_impl 的实现。在 g++实现的STL中，std::_Tuple_impl 的定义及其部分构造函数如下：

  // _Idx 是非类型模板参数
  // _Head 是第一个节点的类型，
  // _Tail...  是剩余节点的类型
  template <std::size_t _Idx, typename _Head, typename... _Tail>
  struct _Tuple_impl<_Idx, _Head, _Tail...>
  : public _Tuple_impl<_Idx + 1, _Tail...>,
    private _Head_base<_Idx, _Head> 
  {
     typedef _Tuple_impl<_Idx + 1, _Tail...>  _Inherited;  // 父类
     typedef _Head_base<_Idx, _Head> 		 _Base;		   // 当前节点类型		
     
     // 默认构造函数
     constexpr _Tuple_impl() : _Inherited(), _Base() 
     { }
        
     // 逐节点复制
     explicit constexpr _Tuple_impl(const _Head &__head, const _Tail &...__tail)
     : _Inherited(__tail...), _Base(__head) 
     { }
        
     // 逐节点移动
    template <typename _UHead, 
      		 typename... _UTail, 
      		 typename = typename enable_if<sizeof...(_Tail) == sizeof...(_UTail)>::type> // 元素个数要一致
    explicit constexpr _Tuple_impl(_UHead &&__head, _UTail &&...__tail)
    : _Inherited(std::forward<_UTail>(__tail)...),
      _Base(std::forward<_UHead>(__head)) 
    { }
     //... 
  };
  
  // 最后一个节点，_Head 是最后一个节点的类型
  template <std::size_t _Idx, typename _Head>
  struct _Tuple_impl<_Idx, _Head> : private _Head_base<_Idx, _Head> 
  { 
     typedef _Tuple_impl<_Idx + 1, _Tail...>  _Inherited;  // 父类
     typedef _Head_base<_Idx, _Head> 		 _Base;		  // 用于构造当前节点	
   	
     // 构造函数
     constexpr _Tuple_impl() : _Base() 
     { }
      
     // 复制当前节点
     explicit constexpr _Tuple_impl(const _Head &__head) : _Base(__head) 
     { }
      
     // 移动当前节点
     template <typename _UHead>
     explicit constexpr _Tuple_impl(_UHead &&__head)
     : _Base(std::forward<_UHead>(__head)) 
     { }
     //...
  };

首先，可以确定std::_Tuple_impl是个递归类。

std::_Tuple_impl<_Idx, _Tail...> 的父类是 std::_Tuple_impl<_Idx + 1, _Tail...>，最终的父类，也就是递归基是 std::_Tuple_impl<_Idx, _Head> 。

我们知道，在继承体系中，最先构造父类，再构造子类。因此，std::tuple 对象中，最先构造的是最后一个节点，最后构造第一个节点。因此 t 中构造顺序是 3.14 --> "Foo" -->1 ：

 // 复制
 explicit constexpr _Tuple_impl(const _Head& __head, const _Tail&... __tail)
 : _Inherited(__tail...), // 先构造父类 
   _Base(__head)  		  // 再构造当前节点
 { }

 // 移动
template <typename    _UHead, 
          typename... _UTail, 
          typename = typename enable_if<sizeof...(_Tail) == sizeof...(_UTail)>::type> 
explicit constexpr _Tuple_impl(_UHead&& __head, _UTail&&... __tail)
: _Inherited(std::forward<_UTail>(__tail)...), // 先构造父类
  _Base(std::forward<_UHead>(__head)) 		   // 再构造当前节点
{ }

然后，std::_Tuple_impl 的每个递归层次都会继承一个基类 std::_Head_base 。这是因为 std::tuple 中每个节点都是由std::_Head_base 对象表示。比如之前的 t ：

1是由 std::_Head_base<0, int> 对象表示；
"Foo" 是由于 std::_Head_base<1, std::string> 对象表示；
3.14 是由 std::_Head_base<2, double> 对象表示。

std:tuple<int, std::string, double> t = std::make_tuple(1, "Foo", 3.14);

好嘞，下面我们继续看看 std::_Head_base 的设计。

std::_Head_base

现在，我们来仔细谈谈 std::tuple 中存储节点类 std::_Head_base 以及如何优化 std::_Head_base 。

现在来设想一个场景，std::tuple 存储的元素中，存在一个或者几个空类对象。如下：

// Empty
class Foo { 
public:
  Foo() { std::cout << "Foo: "<< this <<std::endl;}
};

class Goo { 
public:
 Goo () { std::cout << "Goo: "<< this <<std::endl;}
};

int main(int argc, char const *argv[]) {

  Goo goo{};
  Foo foo{};
  auto t = std::make_tuple(100, foo, goo, 200);

  sizeof(t); // size ???
  return 0;
}

先猜测下，t占用的内存大小???

在上面，我们已经看到 std::_Tuple_impl 是以递归的方式完成的。如果 std::tuple 存储的元素中存在空类对象，比如上面的 foo、goo，那么此时整个继承结构如下：

std::_Tuple_impl<3, int> 是 std::_Head_base<3, int> 的子类，依赖 std::_Head_base<3, int> 来存储int对象200；
std::_Tuple_impl<2, Goo, int> 是 std::_Tuple_impl<3, int>、std::_Head_base<2, Goo>的子类，依赖std::_Head_base<2, Goo> 来存储 goo 。
std::_Tuple_impl<1, Foo, Goo, int> 是 std::_Tuple_impl<2, Goo, int>、std::_Head_base<1, Foo> 的子类，依赖 std::_Head_base<1, Foo> 来存储 foo。
std::_Tuple_impl<0, int, Foo, Goo, int> ，则是 std::_Tuple_impl<1, Foo, Goo, int> 子类，依赖于 std::_Head_base<0, int>来存储100。

所以呢，实际上整个继承体系中有以下四个空基类，用于存储元素：

std::_Head_base<3, int>
std::_Head_base<2, Goo>
std::_Head_base<1, Foo>
std::_Head_base<0, int>

但是我们知道，foo、goo都是空类对象，如果按照【编译器优化之 Empty Base Class Optimization】开篇中提到的继承体系中的内存模型，那么此时t 的内存模型如 Tuple_eq 所示：

class Tuple_eq { 
public:
  Tuple(): val_2_{200}, goo_{}, foo_{}, val_1_{100}
  { }
    
  //... 
private:
  int val_2_;
  Goo goo_;
  Foo foo_;
  int val_1_;
};

sizeof(Tuple_eq); // 12 个字节

自然，sizeof(Tuple_eq)会是12，白白浪费了4个字节。

由于Goo、Foo都是空类，而且是不同的类型（包括没有继承关系），那么编译器机就会基于【编译器优化之 Empty Base Class】一文中提到的空基类优化技术来优化 t 的内存模型，最终变成 Tuple_opti 的内存模型：

class Tuple_opti { 
public:
  Tuple_opti()=default;
    
  //... 
private:
  int val_2_{2};
  int val_1_{1};
};

sizeof(Tuple_opti); // 8个字节

毫无疑问，此时 t 的大小就是8个字节。

上面都是在讨论利用EBCO来优化 std::_Tuple_impl 。下面我们把目光转向 std::_Head_base ，因为能不能令编译器开启EBCO，关键还是在于 std::_Head_base 的实现。std::_Head_base 的类模板原型如下：

  template <std::size_t _Idx, 
            typename _Head,
            bool = __empty_not_final<_Head>::value>
  struct _Head_base;

_Head 即待存储的元素类型，比如上面demo中的int、Foo、Goo。
__empty_not_final 用来判断传入元素类型 _Head 是否能令编译器开启EBCO。

从【编译器优化之 Empty Base Class Optimization】一文可知，要想令编译器开启EBCO，那么 std::_head_base 需要继承空类 _Head。不是类（比如内置类型、union等）、以及无法继承的类，都不能开启EBCO。 __is_empty_non_tuple 的实现如下：

  // _Tp 即 上面的 _Head
  template <typename _Tp>
  struct __is_empty_non_tuple : is_empty<_Tp>
  { };

  //  __empty_not_final 用于判断没有final修饰的空类，是否能开启EBCO 
  // 	有 final修饰， 则直接为 false_type
  // 	没有final修饰，则取决传入的类型 _Head 是否为空类
  template <typename _Tp>
  using __empty_not_final = typename conditional<__is_final(_Tp),false_type,__is_empty_non_tuple<_Tp>>::type;

空基类优化时的 std::_head_base

当传入的_Head 是无final修饰的空类时，__empty_not_final<_Head> 是 True_type 类型。那么 std::_head_base 使用EBCO来优化设计，即继承空基类 _Head ，这样才能令 std::_Tuple_impl 在继承体系中，不为空基类对象分配内存。

  // 特化版本： _Head 是可继承的空类
  template <std::size_t _Idx, typename _Head>
  struct _Head_base<_Idx, _Head, true> : public _Head
  {
    constexpr _Head_base() : _Head() {}
    constexpr _Head_base(const _Head& __h) : _Head(__h) 
    { }
    template <typename _UHead>
    constexpr _Head_base(_UHead&& __h) : _Head(std::forward<_UHead>(__h)) 
    { }
    constexpr _Head_base(const _Head_base &) = default;
    constexpr _Head_base(_Head_base &&) = default;
      
    //...其他构造函数

    // 获取
    static constexpr 
    _Head& _M_head(_Head_base& __b) noexcept { return __b; }
    static constexpr const 
    _Head& _M_head(const _Head_base& __b) noexcept { return __b; }
    
    // 无成员变量
  };

无优化的 std::_head_base

当传入非空类、内置类型、union类型时，那么就需要为这个对象分配内存。这个道理很简单，比如你为int类型分配一个int变量，不能再使用继承int类型，当然int类型也无法继承：

struct IntDemo { 
	IntDemo() = defalut;

	int var{0};
};

那么这个时候的就要使用 has-a 来设计 std::_Head_base ，即如下：

  template <std::size_t _Idx, typename _Head>
  struct _Head_base<_Idx, _Head, false>
  {
    // 默认构造函数
    constexpr _Head_base() : _M_head_impl() {}
    // 传入的是 _Head 类型的左值
    constexpr _Head_base(const _Head &__h) : _M_head_impl(__h) { }
	// 传入的是 _head 类型的右值
    template <typename _UHead>
    constexpr _Head_base(_UHead &&__h) : _M_head_impl(std::forward<_UHead>(__h)) {}
	
    // 复制、移动构造函数
    constexpr _Head_base(const _Head_base& ) = default;
    constexpr _Head_base(_Head_base &&) = default;

    //...其他构造函数
	
    static constexpr
    _Head&  _M_head(_Head_base& __b) noexcept { return __b._M_head_impl; }
    static constexpr const 
    _Head& _M_head(const _Head_base &__b) noexcept { return __b._M_head_impl; }

    // 不使用空基类优化，则定义成员变量
    _Head _M_head_impl;
  };

此时，在 std::_Head_base 中 has-a 传入的 _Head 类型对象 _M_head_impl ，然后在构造函数中对 _M_head_impl 进行初始化。

顺便解释下，std::_Head_base中的复制、移动构造函数皆为默认的原因，这样就直接取决于 _Head 是否实现了复制、移动构造函数。换句话说，如果 _Head 实现了复制、移动构造函数，那么使用 std::_Head_base<I_dx, _Head> 的默认 ctor 、 mtor会直接使用_Head的 ctor、 mtor 。

    constexpr _Head_base(const _Head_base& ) = default;
    constexpr _Head_base(_Head_base &&) = default;

好嘞，到此，相信你对 std::tuple 的底层实现以及空基类优化应该有了更深的理解。那么，下面我们回顾下 std::tuple 的设计。

可以说，std::tuple 就是想和 struct/class 一样，能存储不同类型的对象，而不是像 std::vector 等同质容器只能存储同类型对象。但是问题是，struct/class 的使用者，知道自己要存储什么类型的变量，以及要存储几个变量。

但是 std::tuple 的设计者不知调用者要存储什么类型的变量，也不知道要存储几个。但是也要实现和 struct/class 一样的功能，怎么办？自然就是递归了，通过递归将 std::tuple 构造函数中的参数按照入栈的顺序生成内存模型，比如：

auto t = std::make_tuple(1,2,3);

// 等效内存模型
class Tuple_eq { 
public:
  Tuple_eq() = default;
  // ...
private:
  int val_3_{3};
  int val_2_{2};
  int val_1_{1};
};

这样，我们就理解了 std::tuple 的设计目标以及它的具体设计。

EBCO失效解决办法

最后，我们再来回顾【编译器优化之 Empty Base Class Optimization】中遗留的一个问题：EBCO也有不适用的场景，即子类继承的多个父类中，存在同类型时，会导致EBCO会失效。

// T1、T2不能是同一个类型
template<typename T1, typename T2>
class Foo : private T1, T2 { 
public:
  // ...
};

那么，std::_Tuple_impl 是如何解决这个问题？STL给 std::_Tuple_impl 加上了一个非类型模板函数 _Idx：

  // _Idx 是非类型模板参数
  template <std::size_t _Idx, typename _Head, typename... _Tail>
  struct _Tuple_impl<_Idx, _Head, _Tail...>
  : public _Tuple_impl<_Idx + 1, _Tail...>,
    private _Head_base<_Idx, _Head> 
  {
  	//....
  }

现在，先假设没有这个非类型模板参数 _Idx。

每个 _Tuple_impl<_Head, _Tail...> 类都需要继承 std::_Head_base<_Head> 来开启EBCO，但是如果存在两个一样的空基类 _Head ，就会导致子类继承了两个一样类型空基类，进而导致EBCO失效：

auto t_2 = std::make_tuple(1, Foo{}, Foo{});

// 解释t_2的继承关系 
std::_Tuple_impl<int, Foo, Foo> 继承 std::_Tuple_impl<Foo, Foo> 和 std::_Head_base<int>
std::_Tuple_impl<Foo, Foo> 		继承 std::_Tuple_impl<Foo> 和 std::_Head_base<Foo>
std::_Tuple_impl<Foo>		    继承 std::_Head_base<Foo>

由于 std::_Tuple_impl<Foo> 继承了 std::_Head_base<Foo>，而 std::_Tuple_impl<Foo, Foo>又继承了std::_Tuple_impl<Foo> 和 std::_Head_base<Foo> ，这就导致编译器无法为 std::_Tuple_impl<Foo, Foo> 开启EBCO，要占用两个字节。

但是加上了非类型模板参数_Idx，情况就不一样了：

auto t_2 = std::make_tuple(1,Foo{}, Foo{});

// 解释t_2的继承关系 
std::_Tuple_impl<0, int, Foo, Foo>  继承 std::_Tuple_impl<1, Foo, Foo> 和 std::_Head_base<0, int>
std::_Tuple_impl<1, Foo, Foo> 		继承 std::_Tuple_impl<2, Foo> 和 std::_Head_base<1, Foo>
std::_Tuple_impl<2, Foo>		    继承 std::_Head_base<2, Foo>

现在， std::_Tuple_impl<2, Foo> 继承的空基类是 std::_Head_base<2, Foo> ， std::_Tuple_impl<1, Foo, Foo> 继承的是另一个空基类 std::_Head_base<1, Foo>。很明显，std::_Head_base<1, Foo> 和 std::_Head_base<2, Foo> 不是一个类型，这样使得编译器就能继续开启EBCO，节省内存。

所以，现在可以回答下面demo中t_2占用内存大小了，4个字节！！！。

int main(int argc, char const *argv[]) {

  auto t_2 = std::make_tuple(1, Foo{}, Foo{});
  std::cout<< sizeof(t_2)<<std::endl;;
  return 0;
}

// 编译输出
$ g++ -g -O0  tuple.cc -o t  && ./t
Foo: 0x7fffe3d80f9f
Foo: 0x7fffe3d80f9e
4

by the way

在分析这个demo时，发现vscode的智能提示的BUG。可能是因为std::tuple 内部设计较为复杂，智能提示无法推理出吧。

终于写完了，花了好大的精力才理清关系，如此硬核的内容，给个赞吧。