pyspark.BarrierTaskContext

class pyspark.BarrierTaskContext[source]

A TaskContext with extra contextual info and tooling for tasks in a barrier stage. Use BarrierTaskContext.get() to obtain the barrier context for a running barrier task.

New in version 2.4.0.

Notes

This API is experimental.

Examples

Set a barrier, and execute it with an RDD.

>>> from pyspark import BarrierTaskContext
>>> def block_and_do_something(itr):
...     taskcontext = BarrierTaskContext.get()
...     # Do something.
...
...     # Wait until all tasks have finished.
...     taskcontext.barrier()
...
...     return itr
...
>>> rdd = spark.sparkContext.parallelize([1])
>>> rdd.barrier().mapPartitions(block_and_do_something).collect()
[1]

Methods

- allGather([message]): This function blocks until all tasks in the same stage have reached this routine.
- attemptNumber(): How many times this task has been attempted.
- barrier(): Sets a global barrier and waits until all tasks in this stage hit this barrier.
- cpus(): CPUs allocated to the task.
- get(): Return the currently active BarrierTaskContext.
- getLocalProperty(key): Get a local property set upstream in the driver, or None if it is missing.
- getTaskInfos(): Returns BarrierTaskInfo for all tasks in this barrier stage, ordered by partition ID.
- partitionId(): The ID of the RDD partition that is computed by this task.
- resources(): Resources allocated to the task.
- stageId(): The ID of the stage that this task belongs to.
- taskAttemptId(): An ID that is unique to this task attempt (within the same SparkContext, no two task attempts will share the same attempt ID).