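Readers need to understand how Flink splits a job into tasks. If you do not, look up the relevant material first, otherwise this article will mean little to you.
The official English documentation explains these settings clearly, so it is quoted here as is.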
parallelism.default: The default parallelism to use for programs that have no parallelism specified. (DEFAULT: 1). For setups that have no concurrent jobs running, setting this value to NumTaskManagers * NumSlotsPerTaskManager will cause the system to use all available execution resources for the program’s execution. Note: The default parallelism can be overwritten for an entire job by calling setParallelism(int parallelism) on the ExecutionEnvironment or by passing -p to the Flink Command-line frontend. It can be overwritten for single transformations by calling setParallelism(int parallelism) on an operator. See Parallel Execution for more information about parallelism.
Parallelism is simply the degree of parallelism, that is, the number of threads executing concurrently.
taskmanager.numberOfTaskSlots: The number of parallel operator or user function instances that a single TaskManager can run (DEFAULT: 1). If this value is larger than 1, a single TaskManager takes multiple instances of a function or operator. That way, the TaskManager can utilize multiple CPU cores, but at the same time, the available memory is divided between the different operator or function instances. This value is typically proportional to the number of physical CPU cores that the TaskManager’s machine has (e.g., equal to the number of cores, or half the number of cores). More about task slots.
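This refers to the number of threads managed by the current TaskManager process, and it can also be thought of as a number of cores. A single TaskManager may run multiple tasks, and each task is the finest-grained unit of execution.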
parallelism / taskmanager.numberOfTaskSlots, rounded up, is the number of TaskManagers needed to run at that parallelism.
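The relationship between parallelism and taskmanager.numberOfTaskSlots can be sketched as plain arithmetic. The class SlotMath and its method names below are hypothetical helpers for illustration, not part of Flink, and the sketch assumes that a job needs one slot per unit of parallelism (which holds under Flink's default slot sharing, as far as I know).

```java
// Hypothetical helper, not part of the Flink API: plain arithmetic on the two settings.
final class SlotMath {
    private SlotMath() {}

    // Total slots in the cluster: NumTaskManagers * NumSlotsPerTaskManager.
    // With no other jobs running, this is the largest parallelism that fits.
    static int totalSlots(int numTaskManagers, int slotsPerTaskManager) {
        return Math.multiplyExact(numTaskManagers, slotsPerTaskManager);
    }

    // TaskManagers needed for a given parallelism: parallelism / slots, rounded up.
    static int taskManagersNeeded(int parallelism, int slotsPerTaskManager) {
        if (parallelism < 1 || slotsPerTaskManager < 1) {
            throw new IllegalArgumentException("both arguments must be >= 1");
        }
        return (parallelism + slotsPerTaskManager - 1) / slotsPerTaskManager;
    }

    public static void main(String[] args) {
        // 3 TaskManagers with 4 slots each offer 12 slots in total.
        System.out.println(totalSlots(3, 4));
        // A parallelism of 10 with 4 slots per TaskManager needs 3 TaskManagers.
        System.out.println(taskManagersNeeded(10, 4));
    }
}
```

Rounding up matters because a partially used TaskManager still has to be started: a parallelism of 10 with 4 slots each does not fit on 2 TaskManagers.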