After Spark writes its output to Hadoop,
querying that path through the Hadoop API shows a directory, and the value cannot be read from it directly.
The same phenomenon occurs with MapReduce.
-----------------------------------------------
scala:
result.saveAsTextFile("hdfs://127.0.0.1:9000/spark/t1") // writes /spark/t1 as a directory of part files, not a single file
hadoop:
FSDataInputStream fsDataInputStream = fs.open(new Path("hdfs://127.0.0.1:9000/spark/t1")); // fails: /spark/t1 is a directory, so HDFS rejects the open ("Path is not a file")
-------------------------------------------------------
Spark really does treat /spark/t1 as a directory; the actual data files sit one level below it.
The path can be inspected with the following code:
FileStatus[] files = fs.listStatus(new Path("/spark/t1"));
for (FileStatus f : files) {
    System.out.println(f.getPath());
    System.out.println(f);                                  // full HdfsLocatedFileStatus, as in the output below
    System.out.println("isDirectory:" + f.isDirectory());
}
---------------------------------------------------------
The query finally turns up the following:
hdfs://127.0.0.1:9000/spark/t1/_SUCCESS
HdfsLocatedFileStatus{path=hdfs://127.0.0.1:9000/spark/t1/_SUCCESS; isDirectory=false; length=0; replication=3; blocksize=134217728; modification_time=1656568339191; access_time=1656568339180; owner=JavaDev; group=supergroup; permission=rw-r--r--; isSymlink=false; hasAcl=false; isEncrypted=false; isErasureCoded=false}
isDirectory:false
(hyd,1)
hdfs://127.0.0.1:9000/spark/t1/part-00000
HdfsLocatedFileStatus{path=hdfs://127.0.0.1:9000/spark/t1/part-00000; isDirectory=false; length=7; replication=3; blocksize=134217728; modification_time=1656568339046; access_time=1656568338725; owner=JavaDev; group=supergroup; permission=rw-r--r--; isSymlink=false; hasAcl=false; isEncrypted=false; isErasureCoded=false}
isDirectory:false
(hyd,1)
hdfs://127.0.0.1:9000/spark/t1/part-00001
HdfsLocatedFileStatus{path=hdfs://127.0.0.1:9000/spark/t1/part-00001; isDirectory=false; length=8; replication=3; blocksize=134217728; modification_time=1656568339049; access_time=1656568338726; owner=JavaDev; group=supergroup; permission=rw-r--r--; isSymlink=false; hasAcl=false; isEncrypted=false; isErasureCoded=false}
isDirectory:false
(hyd,1)
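---------------------------------------------------------
To read the data itself, open the part files instead of the directory. Here is a minimal self-contained Java sketch (the ReadSparkOutput class name and the FileSystem setup are assumptions added for completeness; the paths are the ones from above). The part-* glob skips the empty _SUCCESS marker that Spark writes when the job finishes:
hadoop:
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URI;
import java.nio.charset.StandardCharsets;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ReadSparkOutput {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(URI.create("hdfs://127.0.0.1:9000"), new Configuration());

        // part-* matches only the data files, skipping the empty _SUCCESS marker
        FileStatus[] parts = fs.globStatus(new Path("/spark/t1/part-*"));
        for (FileStatus part : parts) {
            try (FSDataInputStream in = fs.open(part.getPath());
                 BufferedReader reader = new BufferedReader(
                         new InputStreamReader(in, StandardCharsets.UTF_8))) {
                String line;
                while ((line = reader.readLine()) != null) {
                    System.out.println(line); // e.g. (hyd,1)
                }
            }
        }
        fs.close();
    }
}
Alternatively, hand the whole directory back to Spark: sc.textFile("hdfs://127.0.0.1:9000/spark/t1") reads every part file under it automatically.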