许多 C++ 权威,或者甚至是计算机科学的权威,都把并行,或者在微观的层面上,多线程,看作下一次革命的主题。
很久没有关心这些事情了,今天读了一些相关文章,很有启发。首先是 C++0x 的一篇 proposal N1680: Memory model for multithreaded C++ 。
A memory model describes the behavior of threads with respect to basic memory operations – mainly reads and writes of variables potentially accessible across multiple threads. The main questions addressed by a memory model include:
Atomicity: Which memory operations have indivisible effects?
Visibility: Under what conditions will the effects of a write action by one thread be seen by a read by another thread?
Ordering: Under what conditions are sequences of memory operations by one or more threads guaranteed to be visible in the same order by other threads?
除了定义这些,主要还谈到由于 Java Memory Model 在这方面相当成熟,未来可能主要会 "Adopt" Java 的模型。
在 N1876 里面,有些东西比较具体化了,比如 Atomicity :
For example, atomically updated integers would provide operations such as load Atomically read the value of the integer.
store Atomically replace the value of the integer.
fetch and add Atomically add a value to the integer.
store with release ordering semantics Atomically replace the value of the integer, and ensure that all prior memory operations executed by this thread become visible to other threads before the update.
load with acquire ordering semantics Atomically load the value of the integer, and ensure that all later memory operations performed by this thread become visible after the load.
其实后面两个在语义上与 locking 有一定的相似之处,但是使用 atomicity, visibility 和 ordering 这样的方式来定义它们给人以 lock-free 的暗示。而且,它们的语义比起 locking 来,还是要弱一些。例如第三个,它只说明内存操作要 visible to other threads ,而 locking 的意思是在同一时间根本不允许其他线程的访问。的确,lock free 算法在性能上的好处是很大的,而且没有 locking 带来的一系列问题。在 HP 网站上,Boehm 给出了一个非常粗略的模型:
enum ordering_constraint {none, acquire, release, ordered};
template <class T>
class atomic {
public:
static bool basics_supported();
template <ordering_constraint c>
void store(const T&);
template <ordering_constraint c>
T load();
static bool cas_supported();
template <ordering_constraint c>
bool cas(const T& old, const T& new_val);
};
template <class T = emulated_atomic<int> >
class atomic_int : public T {
public:
template <ordering_constraint c>
T fetch_and_add(const T&);
template <ordering_constraint c>
T fetch_and_add1();
template <ordering_constraint c>
T fetch_and_sub1();
template <ordering_constraint c>
T fetch_and_and(const T&);
template <ordering_constraint c>
T fetch_and_or(const T&);
};
相信对于程序员来说,这些代码比长篇大论来得更清楚。通过 atomic 可以建立起具有原子性操作语义的数据结构以及对于 CAS 的直接支持,由于 lock-free 算法很难写,使用 lock-free 的数据结构可能是个更好的选择。