apache-spark

Are ClosedByInterruptException exceptions expected when Spark speculation kills tasks?

Posted on 2020-04-07 10:39:21

I'm looking into enabling Spark speculation on a Spark Structured Streaming application. When speculation kills tasks, Spark logs many ClosedByInterruptException exceptions, most of them thrown from inside org.apache.spark.storage.DiskBlockObjectWriter.revertPartialWritesAndClose.
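For reference, speculation is controlled by a handful of standard Spark configuration properties. A minimal sketch of enabling it when building a session (the app name is illustrative; the spark.speculation* keys and their defaults are standard Spark configuration):

import org.apache.spark.sql.SparkSession

// Sketch: enable speculative re-execution of straggler tasks.
val spark = SparkSession.builder()
  .appName("structured-streaming-with-speculation")  // illustrative name
  .config("spark.speculation", "true")               // re-launch slow tasks speculatively
  .config("spark.speculation.interval", "100ms")     // how often to check for stragglers
  .config("spark.speculation.multiplier", "1.5")     // how many times slower than the median counts as slow
  .config("spark.speculation.quantile", "0.75")      // fraction of tasks that must finish before checking
  .getOrCreate()

The same keys can equally be passed with --conf on spark-submit.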

Can these exceptions be safely ignored? They do not show up when speculation is turned off. I'm using Spark 2.4.3.

Sample exception:

2019-07-02 03:38:07,195 [Executor task launch worker for task 667] ERROR org.apache.spark.storage.DiskBlockObjectWriter - Uncaught exception while reverting partial writes to file /data/vol/nodemanager/usercache/spark_user/appcache/application_1556810695108_1045638/blockmgr-54340b28-723b-46a3-b58a-c8598d75e4a2/3f/temp_shuffle_763d619b-26c8-4a0d-bc99-6d4661b42eba
java.nio.channels.ClosedByInterruptException
    at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202)
    at sun.nio.ch.FileChannelImpl.truncate(FileChannelImpl.java:370)
    at org.apache.spark.storage.DiskBlockObjectWriter$$anonfun$revertPartialWritesAndClose$2.apply$mcV$sp(DiskBlockObjectWriter.scala:218)
    at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1369)
    at org.apache.spark.storage.DiskBlockObjectWriter.revertPartialWritesAndClose(DiskBlockObjectWriter.scala:214)
    at org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.stop(BypassMergeSortShuffleWriter.java:237)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:105)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:55)
    at org.apache.spark.scheduler.Task.run(Task.scala:121)
    at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:408)
    at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:414)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
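If the noise is confirmed benign (see the answer below), one common stopgap, not mentioned in the original thread, is to raise the threshold of that one logger so the ERROR lines are dropped. A sketch using the log4j 1.x API that Spark 2.4 bundles:

import org.apache.log4j.{Level, Logger}

// Suppress the "Uncaught exception while reverting partial writes" ERROR lines.
// Note this also hides any genuine disk-writer errors logged by the same class.
Logger.getLogger("org.apache.spark.storage.DiskBlockObjectWriter")
  .setLevel(Level.FATAL)

An equivalent log4j.properties entry would target the same logger name.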


Asked by arunpandianp, viewed 148 times
Answered by cringineer, 2020-01-31 21:45

I guess this is the answer: the issue was fixed in Spark 3.0.

https://issues.apache.org/jira/browse/SPARK-28340
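For intuition on why a speculative kill surfaces this particular exception: Spark kills the losing task attempt by interrupting its thread, and java.nio.channels.FileChannel is an interruptible channel, so a channel operation attempted on an interrupted thread closes the channel and throws ClosedByInterruptException from AbstractInterruptibleChannel.end, the top frame of the trace above. A self-contained sketch reproducing the behavior outside Spark (file and object names are arbitrary):

import java.nio.channels.{ClosedByInterruptException, FileChannel}
import java.nio.file.{Files, StandardOpenOption}

object InterruptDemo extends App {
  val path = Files.createTempFile("demo", ".tmp")
  val channel = FileChannel.open(path, StandardOpenOption.WRITE)

  // Set the interrupt flag on the current thread, as Spark does to a task
  // thread when another attempt of the same task has already succeeded.
  Thread.currentThread().interrupt()

  try channel.truncate(0)  // same call as in the stack trace above
  catch {
    case e: ClosedByInterruptException =>
      println(s"caught $e; channel open = ${channel.isOpen}")  // channel is now closed
  }
  Files.deleteIfExists(path)
}

The exception itself is the normal JDK contract for interruptible channels; what SPARK-28340 addresses is the noisy ERROR-level logging of it during task-kill cleanup, not the interrupt mechanism.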