1.22.0

And when it is unavoidable, attempt to handle the possibility of
C++11's per_thread_waiter destructor being called multiple times
on the same variable.

Previously, if a C++11 build of nsync was requested, we would use
C++11 thread_local to achieve the effects of
pthread_key_create()'s destructor argument; that is, to invoke
cleanup when a thread exits.
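
As a rough illustration of the two mechanisms being compared, the sketch below registers per-thread cleanup once through pthread_key_create()'s destructor argument and once through the destructor of a thread_local object. All names here (waiter_cleanup, WaiterHolder, and so on) are hypothetical and chosen for the example; this is not nsync's code.

```cpp
#include <pthread.h>
#include <atomic>
#include <thread>

// Hypothetical names for illustration only; not nsync's identifiers.
static std::atomic<int> cleanup_calls{0};

// POSIX style: the destructor passed to pthread_key_create() runs at
// thread exit for each thread whose value for the key is non-null.
static void waiter_cleanup(void *v) {
    delete static_cast<int *>(v);
    cleanup_calls.fetch_add(1);
}

static pthread_key_t waiter_key;
static pthread_once_t waiter_key_once = PTHREAD_ONCE_INIT;

static void make_waiter_key() {
    pthread_key_create(&waiter_key, &waiter_cleanup);
}

static void use_pthread_key() {
    pthread_once(&waiter_key_once, &make_waiter_key);
    if (pthread_getspecific(waiter_key) == nullptr) {
        pthread_setspecific(waiter_key, new int(0));
    }
}

// C++11 style: the destructor of a thread_local object runs at thread exit.
struct WaiterHolder {
    int *value = nullptr;
    ~WaiterHolder() {
        if (value != nullptr) {
            delete value;
            cleanup_calls.fetch_add(1);
        }
    }
};

static thread_local WaiterHolder holder;

static void use_thread_local() {
    if (holder.value == nullptr) {
        holder.value = new int(0);
    }
}

// Runs both styles in a short-lived thread and reports how many cleanups
// had run by the time the thread was joined.
int run_both_in_one_thread() {
    cleanup_calls.store(0);
    std::thread t([] {
        use_pthread_key();
        use_thread_local();
    });
    t.join();
    return cleanup_calls.load();
}
```

Both styles clean up when an ordinary thread exits. The difference that matters here is what happens to thread_local objects of the thread that calls exit(), as described next.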

Unfortunately,
a) C++11 specifies that the destructors of static thread_local
   variables for the thread calling exit() are called before other static
   destructors are called.  See
   https://en.cppreference.com/w/cpp/utility/program/exit
   [Strangely, it doesn't destruct all variables, which makes you wonder why it
   bothers to destruct any, given that destructing any is dangerous.]

b) nsync can be pulled into an address space multiple times via
   multiple shared libraries into which nsync has been linked.

Even (a) alone is dangerous: imagine that a programmer uses a thread_local
variable from another static variable's destructor.

When coupled with (b), various implementation behaviours become conceivable.
For example, it's possible that an implementation could cause the same
thread_local variable to be destructed repeatedly, once by each copy of the
code pulled into the address space.

This change does several things.  The first two are related directly to the
issue described above.  The remainder were found during testing.

1) Uses pthread_key_create/pthread_getspecific/pthread_setspecific
   whenever possible, to avoid the use of thread_local.
   That's the version in nsync/platform/posix/src/per_thread_waiter.c.
   On Windows, we use the code that simulates pthread_key_create etc.

2) If the C++11 version is used, it will try to defend itself
   against the destructor being called twice, assuming that's
   a potential issue.  That's the version in
   nsync/platform/c++11/src/per_thread_waiter.cc.

3) It fixes a bug in platform/win32/src/pthread_key_win32.cc.
   Previously the code could try to run a null destructor.  This
   could never happen in practice, because the library never sets
   a null destructor, but it's good to fix.

4) It casts the pointer passed to Windows'
   InterlockedCompareExchange().  Different compilers on Windows
   have different ideas about type compatibility with uint32_t.
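
One way to picture the defence described in (2), together with the null-destructor check from (3), is a cleanup routine that clears its own fields before acting on them, so that a second invocation finds nothing to do. This is a minimal sketch under that assumption; PerThreadSlot, release() and count_release() are hypothetical names and the code is not taken from nsync.

```cpp
// Hypothetical per-thread slot; not nsync's actual data structure.
struct PerThreadSlot {
    void *waiter = nullptr;           // per-thread value to clean up
    void (*dest)(void *) = nullptr;   // cleanup function; may be null

    // Idempotent cleanup: copy the fields, clear them, then act on the
    // copies.  A repeated call sees null fields and does nothing.
    void release() {
        void *w = waiter;
        void (*d)(void *) = dest;
        waiter = nullptr;
        dest = nullptr;
        if (d != nullptr && w != nullptr) {  // never call a null destructor
            d(w);
        }
    }

    ~PerThreadSlot() { release(); }
};

static int release_count = 0;
static void count_release(void *) { ++release_count; }

// Simulates repeated cleanup by calling release() twice, then reports
// how many times the cleanup function actually ran.
int release_twice() {
    release_count = 0;
    int dummy = 0;
    PerThreadSlot slot;
    slot.waiter = &dummy;
    slot.dest = &count_release;
    slot.release();
    slot.release();
    return release_count;
}
```

A guard like this can only help while the object's storage is still readable on the second call, which is why the description above says the code "will try" to defend itself rather than promising safety.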

See
  https://github.com/tensorflow/tensorflow/issues/31301