You want to perform some parallel processing within your dataflow mesh.
Solution
By default, each dataflow block is independent of every other block. When you link two blocks together, they still process their data independently. So, every dataflow mesh has some natural parallelism built in.
If you need to go beyond this—for example, if you have one particular block that does heavy CPU computations—then you can instruct that block to operate in parallel on its input data by setting the MaxDegreeOfParallelism option. By default, this option is set to 1, so each dataflow block will only process one piece of data at a time.
MaxDegreeOfParallelism can be set to DataflowBlockOptions.Unbounded or any value greater than zero. The following example permits any number of tasks to be multiplying data simultaneously:
var multiplyBlock = new TransformBlock<int, int>(
    item => item * 2,
    new ExecutionDataflowBlockOptions
    {
        MaxDegreeOfParallelism = DataflowBlockOptions.Unbounded
    });
var subtractBlock = new TransformBlock<int, int>(item => item - 2);
multiplyBlock.LinkTo(subtractBlock);
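To see the mesh above run end to end, the following is a minimal sketch of a complete program. The class name ParallelMeshDemo, the helper RunAsync, the input values 0 through 4, and the choice of Environment.ProcessorCount as the degree of parallelism are all illustrative assumptions, not part of the recipe. It relies on the default behavior that a block emits its results in the order the inputs arrived, even when it processes them in parallel.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public static class ParallelMeshDemo
{
    // Hypothetical helper: builds the mesh, feeds it five items, collects the results.
    public static async Task<List<int>> RunAsync()
    {
        var multiplyBlock = new TransformBlock<int, int>(
            item => item * 2,
            new ExecutionDataflowBlockOptions
            {
                // Assumption: cap parallelism at the core count instead of Unbounded.
                MaxDegreeOfParallelism = Environment.ProcessorCount
            });
        var subtractBlock = new TransformBlock<int, int>(item => item - 2);

        // Propagate completion so the second block finishes once the first is drained.
        multiplyBlock.LinkTo(subtractBlock,
            new DataflowLinkOptions { PropagateCompletion = true });

        for (int i = 0; i < 5; i++)
            multiplyBlock.Post(i);
        multiplyBlock.Complete();

        // Drain the output of the last block until it reports no more data.
        var results = new List<int>();
        while (await subtractBlock.OutputAvailableAsync())
        {
            while (subtractBlock.TryReceive(out int value))
                results.Add(value);
        }
        await subtractBlock.Completion;
        return results;
    }

    public static async Task Main()
    {
        List<int> results = await RunAsync();
        Console.WriteLine(string.Join(", ", results)); // prints "-2, 0, 2, 4, 6"
    }
}
```

Only multiplyBlock runs in parallel here; subtractBlock keeps the default of one item at a time, which is usually fine for cheap work.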
Discussion
The MaxDegreeOfParallelism option makes parallel processing within a block easy to do. What is not so easy is determining which blocks need it. One technique is to pause dataflow execution in the debugger, where you can see the number of data items queued up (i.e., the ones that haven’t yet been processed by the block). An unexpected number of data items can be an indication that some restructuring or parallelization would be helpful.
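The queue sizes you would inspect in the debugger can also be read in code through the InputCount and OutputCount properties of the block. The sketch below is only an illustration: the class name BacklogDemo, the helper Measure, the 100 ms of simulated work, and the 250 ms wait are assumptions chosen to make a backlog visible, and the exact numbers printed will vary from run to run.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks.Dataflow;

public static class BacklogDemo
{
    // Hypothetical helper: returns (items waiting for input, items waiting for output).
    public static (int Input, int Output) Measure()
    {
        // A deliberately slow block with the default MaxDegreeOfParallelism of 1.
        var slowBlock = new TransformBlock<int, int>(item =>
        {
            Thread.Sleep(100); // simulated CPU-heavy work
            return item * 2;
        });

        for (int i = 0; i < 20; i++)
            slowBlock.Post(i);

        // Give the block a moment to start working, then sample its queues.
        Thread.Sleep(250);
        return (slowBlock.InputCount, slowBlock.OutputCount);
    }

    public static void Main()
    {
        var (input, output) = Measure();
        Console.WriteLine($"Waiting to be processed: {input}");
        Console.WriteLine($"Processed but not yet consumed: {output}");
    }
}
```

A large InputCount on one block while its neighbors sit idle suggests that block is a candidate for a higher MaxDegreeOfParallelism.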
MaxDegreeOfParallelism also works if the dataflow block does asynchronous processing. In this case, the MaxDegreeOfParallelism option specifies the level of concurrency—a certain number of slots. Each data item takes up a slot when the block begins processing it and only leaves that slot when the asynchronous processing is fully completed.
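The slot behavior for asynchronous delegates can be observed with a small experiment. This is a sketch under stated assumptions: the class name AsyncSlotsDemo, the helper RunAsync, the limit of 3, and the Task.Delay standing in for real asynchronous I/O are all hypothetical. It counts how many items are in flight at once and records the highest value seen.

```csharp
using System;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public static class AsyncSlotsDemo
{
    // Hypothetical helper: returns the peak number of items processed concurrently.
    public static async Task<int> RunAsync()
    {
        var gate = new object();
        int inFlight = 0;
        int peak = 0;

        var block = new ActionBlock<int>(async item =>
        {
            // The item occupies a slot from here...
            lock (gate)
            {
                inFlight++;
                peak = Math.Max(peak, inFlight);
            }

            await Task.Delay(50); // stand-in for asynchronous work

            // ...until the asynchronous work has fully completed.
            lock (gate)
            {
                inFlight--;
            }
        },
        new ExecutionDataflowBlockOptions
        {
            MaxDegreeOfParallelism = 3 // at most three slots
        });

        for (int i = 0; i < 10; i++)
            block.Post(i);
        block.Complete();
        await block.Completion;

        return peak;
    }

    public static async Task Main()
    {
        int peak = await RunAsync();
        Console.WriteLine($"Peak concurrency: {peak}"); // never more than 3
    }
}
```

Because a slot stays occupied across the await, the limit applies to whole asynchronous operations rather than to the threads that happen to run their synchronous portions.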