spdlog的一些坑


最近在用C++的这个spdlog库中遇到了一些坑……在这里写一下

线程不安全

这个库宣称自己是线程安全的,实际上它做的是:


// 我们调用:
logger->info("output {}", (void*)ptr);

// 它做的:
std::tm tm;
localtime_r(&time_tt, &tm);
CharBuffer time = asctime(&tm);
CharBuffer s = time + ("output ");
CharBuffer arg_buffer;
std::ostream os(arg_buffer);
os << (void*)ptr;
s.append(arg_buffer);
{
    std::lock_guard lk(stdout_mutex);
    std::cout << s.str();
}
// 最终输出:
// 2017-02-01 21:00:00 output 0xdeadbeef

这里有两个地方不安全

localtime_r

├───────────────┼───────────────┼─────────────────────────────────┤
│ctime_r(),     │ Thread safety │ MT-Safe env locale              │
│gmtime_r(),    │               │                                 │
│localtime_r(), │               │                                 │
│mktime()       │               │                                 │
├───────────────┼───────────────┼─────────────────────────────────┤

localtime_r虽然是MT-Safe的,但它对env, locale的读取没有做任何同步措施。因此中途如果另一个线程对env/locale发生了修改,可能会发生各种各样神奇的事故。当然,这个是次要的问题,更主要的问题在下面。

operator <<(ostream&, void *const)

大家可能常常觉得只要cout那里锁住了就好,实际上并不如此。上面这段代码用valgrind –tool=drd跑一下就会出现如下警告:

==11851== ----------------------------------------------------------------
==11851== 
==11851== Possible data race during read of size 1 at 0x51C30B9 by thread #3
==11851== Locks held: none
==11851==    at 0x4F500CB: widen (locale_facets.h:875)
==11851==    by 0x4F500CB: widen (basic_ios.h:450)
==11851==    by 0x4F500CB: fill (basic_ios.h:374)
==11851==    by 0x4F500CB: std::ostream& std::ostream::_M_insert(long) (ostream.tcc:73)
==11851==    by 0x40CAD3: operator<<(std::ostream&, UserDefinedObject const&) (42.cpp:14)
==11851==    by 0x422A33: void fmt::format_arg<char, fmt::ArgFormatter, UserDefinedObject>(fmt::BasicFormatter<char, fmt::ArgFormatter >&, char const*&, UserDefinedObject const&) (ostream.h:111)
==11851==    by 0x41EA93: void fmt::internal::MakeValue<fmt::BasicFormatter<char, fmt::ArgFormatter > >::format_custom_arg(void*, void const*, void*) (format.h:1537)
==11851==    by 0x41B256: fmt::BasicArgFormatter<fmt::ArgFormatter, char>::visit_custom(fmt::internal::Value::CustomValue) (format.h:2550)
==11851==    by 0x4182C7: fmt::ArgVisitor<fmt::ArgFormatter, void>::visit(fmt::internal::Arg const&) (format.h:1993)
==11851==    by 0x414875: fmt::BasicFormatter<char, fmt::ArgFormatter >::format(char const*&, fmt::internal::Arg const&) (format.h:4517)
==11851==    by 0x4104A9: fmt::BasicFormatter<char, fmt::ArgFormatter >::format(fmt::BasicCStringRef) (format.h:4541)
==11851==    by 0x40CDBE: fmt::BasicWriter::write(fmt::BasicCStringRef, fmt::ArgList) (format.h:3104)
==11851==    by 0x417CDA: void fmt::BasicWriter::write(fmt::BasicCStringRef, UserDefinedObject const&) (format.h:3106)
==11851==    by 0x413696: void spdlog::logger::log(spdlog::level::level_enum, char const*, UserDefinedObject const&) (logger_impl.h:69)
==11851==    by 0x410014: void spdlog::logger::info(char const*, UserDefinedObject const&) (logger_impl.h:139)
==11851== 
==11851== This conflicts with a previous write of size 8 by thread #2
==11851== Locks held: none
==11851==    at 0x4EF3B1F: do_widen (locale_facets.h:1107)
==11851==    by 0x4EF3B1F: std::ctype::_M_widen_init() const (ctype.cc:94)
==11851==    by 0x4F501B7: widen (locale_facets.h:876)
==11851==    by 0x4F501B7: widen (basic_ios.h:450)
==11851==    by 0x4F501B7: fill (basic_ios.h:374)
==11851==    by 0x4F501B7: std::ostream& std::ostream::_M_insert(long) (ostream.tcc:73)
==11851==    by 0x40CAD3: operator<<(std::ostream&, UserDefinedObject const&) (42.cpp:14)
==11851==    by 0x422A33: void fmt::format_arg<char, fmt::ArgFormatter, UserDefinedObject>(fmt::BasicFormatter<char, fmt::ArgFormatter >&, char const*&, UserDefinedObject const&) (ostream.h:111)
==11851==    by 0x41EA93: void fmt::internal::MakeValue<fmt::BasicFormatter<char, fmt::ArgFormatter > >::format_custom_arg(void*, void const*, void*) (format.h:1537)
==11851==    by 0x41B256: fmt::BasicArgFormatter<fmt::ArgFormatter, char>::visit_custom(fmt::internal::Value::CustomValue) (format.h:2550)
==11851==    by 0x4182C7: fmt::ArgVisitor<fmt::ArgFormatter, void>::visit(fmt::internal::Arg const&) (format.h:1993)
==11851==    by 0x414875: fmt::BasicFormatter<char, fmt::ArgFormatter >::format(char const*&, fmt::internal::Arg const&) (format.h:4517)
==11851==  Address 0x51c30b9 is 89 bytes inside data symbol "_ZN12_GLOBAL__N_17ctype_cE"
==11851== 
==11851== ----------------------------------------------------------------

如上所示,即使两个完全不同的ostream的operator << 也有可能发生data race,这就很尴尬了。这是因为在operator<<中调用了locale的widen方法,widen方法是怎么实现的呢?下面是libstdc++ 6.2.1 bits/locale_facet.h的实现:


      char_type
      widen(char __c) const
      {
    if (_M_widen_ok) // _M_widen_ok: char
      return _M_widen[static_cast(__c)];
    this->_M_widen_init();
    return this->do_widen(__c);
      }

      virtual const char*
      do_widen(const char* __lo, const char* __hi, char_type* __to) const
      {
    __builtin_memcpy(__to, __lo, __hi - __lo);
    return __hi;
      }

很明显这里没有任何同步措施。这就造成了spdlog坑居多……

正确的做法是什么呢?

待续。。(翻了一圈GTest好像也没有考虑这点……而且也不像是libstdc++的bug……