Concurrency in C# Cookbook, Chinese translation: 5.5 Parallel Processing with Dataflow Blocks

Problem

You want to perform some parallel processing within your dataflow mesh.

Solution

By default, each dataflow block is independent of every other block. When you link two blocks together, each one processes its data independently of the other. So every dataflow mesh has some natural parallelism built in.
If you need to go beyond this (for example, if one particular block does heavy CPU computations), you can instruct that block to operate in parallel on its input data by setting the MaxDegreeOfParallelism option. By default, this option is set to 1, so each dataflow block processes only one piece of data at a time.
MaxDegreeOfParallelism can be set to DataflowBlockOptions.Unbounded or any value greater than zero. The following example permits any number of tasks to multiply data simultaneously:
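using System.Threading.Tasks.Dataflow;

// Multiply each item by 2, with no limit on how many items are processed at once.
var multiplyBlock = new TransformBlock<int, int>(
    item => item * 2,
    new ExecutionDataflowBlockOptions
    {
        MaxDegreeOfParallelism = DataflowBlockOptions.Unbounded
    });

// This block keeps the default MaxDegreeOfParallelism of 1: one item at a time.
var subtractBlock = new TransformBlock<int, int>(item => item - 2);
multiplyBlock.LinkTo(subtractBlock);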
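
For CPU-bound work, a fixed limit is often used instead of Unbounded. The following is a minimal sketch of that variation, not part of the original recipe: the class name BoundedParallelismSketch, the method DoubleAllAsync, and the collecting ActionBlock are hypothetical names introduced only for illustration. It also relies on the fact that a TransformBlock emits results in input order by default, even when it processes items in parallel.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

// Hypothetical helper, introduced for illustration only.
public static class BoundedParallelismSketch
{
    // Doubles each input using at most maxParallelism concurrent workers.
    public static async Task<List<int>> DoubleAllAsync(IEnumerable<int> inputs, int maxParallelism)
    {
        var multiplyBlock = new TransformBlock<int, int>(
            item => item * 2,
            new ExecutionDataflowBlockOptions
            {
                MaxDegreeOfParallelism = maxParallelism
            });

        var results = new List<int>();

        // This block keeps the default MaxDegreeOfParallelism of 1,
        // so the list is only modified by one task at a time.
        var collectBlock = new ActionBlock<int>(item => results.Add(item));

        // Propagate completion so awaiting the last block waits for the whole mesh.
        multiplyBlock.LinkTo(collectBlock, new DataflowLinkOptions { PropagateCompletion = true });

        foreach (var input in inputs)
            multiplyBlock.Post(input);
        multiplyBlock.Complete();

        await collectBlock.Completion;
        return results;
    }
}
```

A call such as BoundedParallelismSketch.DoubleAllAsync(new[] { 1, 2, 3 }, Environment.ProcessorCount) would cap the multiply block at one worker per logical processor. Whether that is the right cap depends on the workload, so treat the value as a starting point to measure against.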

Discussion

The MaxDegreeOfParallelism option makes parallel processing within a block easy to do. What is not so easy is determining which blocks need it. One technique is to pause dataflow execution in the debugger, where you can see the number of data items queued up (i.e., the ones that haven't yet been processed by the block). An unexpectedly large number of queued items can indicate that some restructuring or parallelization would help.
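
The same queue sizes that the debugger shows can also be read in code, since TransformBlock exposes InputCount and OutputCount properties. The sketch below is a supplement to the debugger technique rather than something the recipe itself describes; QueueDepthSketch and Describe are hypothetical names, and the counts are only a snapshot that may already be stale when they are printed.

```csharp
using System;
using System.Threading.Tasks.Dataflow;

// Hypothetical helper, introduced for illustration only.
public static class QueueDepthSketch
{
    // Returns a one-line summary of how many items are queued on each side of a block.
    // InputCount: items received but not yet picked up for processing.
    // OutputCount: items processed but not yet taken by a linked target or receiver.
    public static string Describe<TIn, TOut>(string name, TransformBlock<TIn, TOut> block)
        => $"{name}: {block.InputCount} waiting for input, {block.OutputCount} waiting for output";
}
```

Logging Describe("multiply", multiplyBlock) periodically while the mesh runs gives a rough view of where items pile up. A block whose input count keeps growing is a candidate for a higher MaxDegreeOfParallelism.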
MaxDegreeOfParallelism also works if the dataflow block does asynchronous processing. In this case, the option specifies the level of concurrency as a certain number of slots. Each data item takes up a slot when the block begins processing it and leaves that slot only when the asynchronous processing has fully completed.
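
The slot behavior can be observed with a small experiment. This is a sketch under stated assumptions: AsyncSlotsSketch and MeasurePeakConcurrencyAsync are hypothetical names, and Task.Delay stands in for real asynchronous work such as an I/O call. The block counts how many items are in flight at once and reports the highest value seen, which should never exceed the configured MaxDegreeOfParallelism.

```csharp
using System.Threading;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

// Hypothetical helper, introduced for illustration only.
public static class AsyncSlotsSketch
{
    // Sends itemCount simulated asynchronous operations through a block limited to
    // maxParallelism slots and returns the highest number observed in flight at once.
    public static async Task<int> MeasurePeakConcurrencyAsync(int itemCount, int maxParallelism)
    {
        int inFlight = 0;
        int peak = 0;

        var block = new ActionBlock<int>(
            async item =>
            {
                // The item occupies a slot from here...
                int current = Interlocked.Increment(ref inFlight);

                // Record a new peak with a compare-and-swap loop.
                int seen;
                while (current > (seen = Volatile.Read(ref peak)))
                    Interlocked.CompareExchange(ref peak, current, seen);

                await Task.Delay(50); // stand-in for real asynchronous work

                // ...until the asynchronous work has fully completed.
                Interlocked.Decrement(ref inFlight);
            },
            new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = maxParallelism });

        for (int i = 0; i < itemCount; i++)
            block.Post(i);
        block.Complete();
        await block.Completion;

        return peak;
    }
}
```

The slot is held across the await, so the delay counts against the limit even though no thread is blocked while it elapses.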