The error is below. If you don't care about the cause, skip straight to the solution and try it.
Exception in thread "main" java.lang.IllegalArgumentException: Pathname /C:/Users/Administrator/AppData/Local/Temp/1/temporary-611514af-8dc5-4b20-9237-e5f2d21fdf88/metadata from hdfs://master:8020/C:/Users/Administrator/AppData/Local/Temp/1/temporary-611514af-8dc5-4b20-9237-e5f2d21fdf88/metadata is not a valid DFS filename.
at org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:197)
at org.apache.hadoop.hdfs.DistributedFileSystem.access$000(DistributedFileSystem.java:106)
at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1305)
at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1301)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1317)
at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1426)
at org.apache.spark.sql.execution.streaming.StreamMetadata$.read(StreamMetadata.scala:51)
at org.apache.spark.sql.execution.streaming.StreamExecution.<init>(StreamExecution.scala:122)
at org.apache.spark.sql.execution.streaming.MicroBatchExecution.<init>(MicroBatchExecution.scala:49)
at org.apache.spark.sql.streaming.StreamingQueryManager.createQuery(StreamingQueryManager.scala:258)
at org.apache.spark.sql.streaming.StreamingQueryManager.startQuery(StreamingQueryManager.scala:299)
at org.apache.spark.sql.streaming.DataStreamWriter.start(DataStreamWriter.scala:296)
at com.gugu.book.basespark.chapter08.StructuredNetworkWordCount$.main(StructuredNetworkWordCount.scala:25)
at com.gugu.book.basespark.chapter08.StructuredNetworkWordCount.main(StructuredNetworkWordCount.scala)
At first glance the error looks like a permissions problem or a missing path, but the path itself is odd:
hdfs://master:8020/C:/Users/Administrator/AppData/Local/Temp/1/temporary-611514af-8dc5-4b20-9237-e5f2d21fdf88/metadata is not a valid DFS filename
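A C:/... path should not appear here. It is a local path on my Windows machine, yet it has been appended to the HDFS file system URI, so some configuration must be wrong.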
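To see how a local path can end up under hdfs://master:8020, here is a minimal sketch of the effect. It uses `java.net.URI` as a stand-in for Hadoop's own path handling, which is an assumption made purely for illustration; the object name `CheckpointPathDemo` and the helper `qualify` are hypothetical. A path with no scheme inherits the scheme and authority of the default file system, while a path that carries its own `file://` scheme is left alone.

```scala
import java.net.URI

object CheckpointPathDemo {
  // Resolve a path against the default file system URI.
  // java.net.URI is only an approximation of Hadoop's path
  // qualification (assumption for illustration).
  def qualify(defaultFs: String, path: String): String =
    new URI(defaultFs).resolve(new URI(path)).toString

  def main(args: Array[String]): Unit = {
    val defaultFs = "hdfs://master:8020/"

    // No scheme: the local Windows path is glued onto the HDFS URI.
    println(qualify(defaultFs,
      "/C:/Users/Administrator/AppData/Local/Temp/1/temporary-611514af-8dc5-4b20-9237-e5f2d21fdf88/metadata"))

    // Explicit file:// scheme: the path stays on the local file system.
    println(qualify(defaultFs, "file:///D:/applicationfiles/data/kafka"))
  }
}
```

The first call reproduces the strange address from the error message, which suggests the fix is to give the checkpoint path an explicit scheme.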
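Following the stack trace, the offending path is checkpointRoot, and its value comes from checkpointLocation, so that is the only thing that needs to change. See: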
org.apache.spark.sql.streaming.StreamingQueryManager#createQuery
In other words, changing the value of "checkpointLocation" directly is enough.
Set it before calling start:
.option("checkpointLocation", "file:///D:\\applicationfiles\\data\\kafka")
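Solved, works perfectly.
It was an environment issue: on Windows, some default locations (here the temporary checkpoint directory) get appended to the HDFS URI.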
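For context, this is roughly where the option sits in a complete job. It is a sketch under assumptions, not the original program: the socket source, the host `localhost`, the port `9999`, the `local[*]` master and the word-count logic are all guesses based on the class name StructuredNetworkWordCount. Only the `checkpointLocation` value is taken from the fix above.

```scala
import org.apache.spark.sql.SparkSession

object StructuredNetworkWordCount {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder
      .appName("StructuredNetworkWordCount")
      .master("local[*]") // assumption: running locally from the IDE
      .getOrCreate()
    import spark.implicits._

    // Assumption: lines arrive over a local socket (e.g. `nc -lk 9999`).
    val lines = spark.readStream
      .format("socket")
      .option("host", "localhost")
      .option("port", 9999)
      .load()

    // Split each line into words and count occurrences.
    val wordCounts = lines.as[String]
      .flatMap(_.split(" "))
      .groupBy("value")
      .count()

    val query = wordCounts.writeStream
      .outputMode("complete")
      .format("console")
      // The explicit file:// scheme keeps the checkpoint on the local
      // disk instead of letting it resolve against HDFS.
      .option("checkpointLocation", "file:///D:\\applicationfiles\\data\\kafka")
      .start()

    query.awaitTermination()
  }
}
```

If the option is left out, Spark falls back to a temporary directory, which is where the `temporary-...` path in the error comes from.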