• Spark job on the cluster fails at startup with: Unrecognized Hadoop major version number: 3.0.0-cdh6.3.2


    I'm running a CDH cluster; if you're not on CDH 6.3 or later, this post isn't for you.

    Here is my pom. Writing data to Hive works fine when I run locally, but once I build it into a jar and run it on the server, it fails with the error above.

    <project xmlns="http://maven.apache.org/POM/4.0.0"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
        <modelVersion>4.0.0</modelVersion>
        <properties>
            <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
            <project.reporting.outputEncoding>UTF-8</project.reporting.outputEncoding>
            <maven.compiler.source>1.8</maven.compiler.source>
            <maven.compiler.target>1.8</maven.compiler.target>
            <scala.version>2.11.12</scala.version>
            <spark.version>2.4.0</spark.version>
            <java.version>1.8</java.version>
        </properties>
        <groupId>org.example</groupId>
        <artifactId>DataPrepare</artifactId>
        <version>1.0-SNAPSHOT</version>
        <dependencies>
            <dependency>
                <groupId>org.scala-lang</groupId>
                <artifactId>scala-library</artifactId>
                <version>${scala.version}</version>
            </dependency>
            <dependency>
                <groupId>org.apache.spark</groupId>
                <artifactId>spark-core_2.11</artifactId>
                <version>${spark.version}</version>
            </dependency>
            <dependency>
                <groupId>org.apache.spark</groupId>
                <artifactId>spark-hive_2.11</artifactId>
                <version>${spark.version}</version>
            </dependency>
            <dependency>
                <groupId>org.apache.spark</groupId>
                <artifactId>spark-sql_2.11</artifactId>
                <version>${spark.version}</version>
            </dependency>
        </dependencies>
        <build>
            <finalName>WordCount</finalName>
            <plugins>
                <plugin>
                    <groupId>net.alchim31.maven</groupId>
                    <artifactId>scala-maven-plugin</artifactId>
                    <version>3.2.2</version>
                    <executions>
                        <execution>
                            <goals>
                                <goal>compile</goal>
                                <goal>testCompile</goal>
                            </goals>
                        </execution>
                    </executions>
                </plugin>
                <plugin>
                    <groupId>org.apache.maven.plugins</groupId>
                    <artifactId>maven-assembly-plugin</artifactId>
                    <version>3.0.0</version>
                    <configuration>
                        <archive>
                            <manifest>
                                <mainClass>WordCount</mainClass>
                            </manifest>
                        </archive>
                        <descriptorRefs>
                            <descriptorRef>jar-with-dependencies</descriptorRef>
                        </descriptorRefs>
                    </configuration>
                    <executions>
                        <execution>
                            <id>make-assembly</id>
                            <phase>package</phase>
                            <goals>
                                <goal>single</goal>
                            </goals>
                        </execution>
                    </executions>
                </plugin>
            </plugins>
        </build>
    </project>


    I later learned that this error only shows up when your own code bundles dependencies such as hive-jdbc, spark-core, spark-sql, and spark-hive into the application jar.

    The conflict is between the upstream spark-hive jar and Spark 2.4 on this platform: spark-hive_2.11 2.4.0 pulls in Hive 1.2.1 classes whose Hadoop version check only recognizes major versions 1 and 2, so they reject the 3.0.0-cdh6.3.2 version string. This is a known community issue, not a bug in your code.

    When running locally from IDEA you do need the spark-hive dependency, but once you upload the jar to the cluster and hand it to YARN, every node's own Spark installation already provides the Hive integration, so the copy bundled in your jar must not be there.

    Solution: before packaging the jar for the cluster, comment out the spark-hive dependency, then rebuild and submit.
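
    As an alternative to commenting the dependency out and back in for every build, here is a sketch of one way to let Maven handle it, assuming the cluster's Spark already provides the Hive integration classes: add a scope property to the existing properties block, reference it from the spark-hive dependency, and add a profile for cluster builds. The property name spark.scope and the profile id cluster are just illustrative names I chose, not anything required by Maven or Spark.

    <properties>
        <!-- default: bundle the classes so local runs from IDEA keep working -->
        <spark.scope>compile</spark.scope>
    </properties>

    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-hive_2.11</artifactId>
        <version>${spark.version}</version>
        <!-- provided-scope dependencies are left out of the jar-with-dependencies assembly -->
        <scope>${spark.scope}</scope>
    </dependency>

    <profiles>
        <profile>
            <!-- activate with: mvn clean package -Pcluster -->
            <id>cluster</id>
            <properties>
                <spark.scope>provided</spark.scope>
            </properties>
        </profile>
    </profiles>

    Building with mvn clean package -Pcluster then produces a fat jar (named WordCount-jar-with-dependencies.jar with this POM) that no longer contains the upstream spark-hive/Hive classes, and the executors use the Hive support shipped with the cluster's Spark instead. The same scope can be applied to spark-core and spark-sql, since the cluster provides those as well.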