• Hive on Spark, implemented in code


    CDH version 6.3

        Spark version  2.4.0-cdh6.3.2
        Hadoop version 3.0.0-cdh6.3.2
        Hive version   2.1.1-cdh6.3.2
    

    The pom.xml file

    <?xml version="1.0" encoding="UTF-8"?>
    <project xmlns="http://maven.apache.org/POM/4.0.0"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
        <modelVersion>4.0.0</modelVersion>
    
        <groupId>bigdata</groupId>
        <artifactId>data_shuffle_demo</artifactId>
        <version>1.0-SNAPSHOT</version>
        <properties>
    
            <scala.version>2.11.12</scala.version>
            <spark.version>2.4.0-cdh6.3.2</spark.version>
            <hadoop.version>3.0.0-cdh6.3.2</hadoop.version>
            <!-- For packaging use provided; for local run/debug use compile -->
            <mvn.scop.pro>provided</mvn.scop.pro>
            <!-- Dependencies marked provided must already exist in the Spark jars directory on the server, at /opt/cloudera/parcels/CDH/lib/spark/jars -->
            <!-- Prefix for the built jar's name -->
            <jar.pre.name>optimization</jar.pre.name>
        </properties>
    
        <repositories>
    
            <repository>
    
                <id>cloudera</id>
                <url>https://repository.cloudera.com/artifactory/cloudera-repos/</url>
                <releases>
                    <enabled>true</enabled>
                </releases>
                <snapshots>
                    <enabled>false</enabled>
                </snapshots>
            </repository>
    
        </repositories>
    
        <dependencies>
            <dependency>
                <groupId>org.scala-lang</groupId>
                <artifactId>scala-library</artifactId>
                <version>${scala.version}</version>
                <scope>${mvn.scop.pro}</scope>
            </dependency>
            <dependency>
                <groupId>org.apache.hadoop</groupId>
                <artifactId>hadoop-client</artifactId>
                <version>${hadoop.version}</version>
                <scope>${mvn.scop.pro}</scope>
            </dependency>
            <dependency>
                <groupId>org.apache.spark</groupId>
                <artifactId>spark-core_2.11</artifactId>
                <version>${spark.version}</version>
                <exclusions>
                    <exclusion>
                        <artifactId>hadoop-client</artifactId>
                        <groupId>org.apache.hadoop</groupId>
                    </exclusion>
                </exclusions>
                <scope>${mvn.scop.pro}</scope>
            </dependency>
            <dependency>
                <groupId>org.apache.spark</groupId>
                <artifactId>spark-sql_2.11</artifactId>
                <version>${spark.version}</version>
                <scope>${mvn.scop.pro}</scope>
            </dependency>
    
    
            <dependency>
                <groupId>org.apache.hive</groupId>
                <artifactId>hive-exec</artifactId>
                <version>2.1.1-cdh6.3.2</version>
            </dependency>
    
        <dependency>
            <groupId>org.apache.hive</groupId>
            <artifactId>hive-jdbc</artifactId>
            <version>2.1.1-cdh6.3.2</version>
            <scope>${mvn.scop.pro}</scope>
        </dependency>
    
            <dependency>
                <groupId>org.apache.spark</groupId>
                <artifactId>spark-hive_2.11</artifactId>
                <version>${spark.version}</version>
                <scope>${mvn.scop.pro}</scope>
            </dependency>
    
        </dependencies>
    
        <build>
            <finalName>${jar.pre.name}</finalName>
            <plugins>
    
    
                <!-- Set the Java compiler source/target version -->
                <plugin>
                    <groupId>org.apache.maven.plugins</groupId>
                    <artifactId>maven-compiler-plugin</artifactId>
                    <version>3.8.1</version>
                    <configuration>
                        <source>1.8</source>
                        <target>1.8</target>
                    </configuration>
                </plugin>
    
    
                <!-- Declare and bind the Scala compiler plugin -->
                <plugin>
                    <groupId>net.alchim31.maven</groupId>
                    <artifactId>scala-maven-plugin</artifactId>
                    <version>3.2.2</version>
                    <executions>
                        <execution>
                            <goals>
                                <goal>compile</goal>
                                <goal>testCompile</goal>
                            </goals>
                        </execution>
                    </executions>
                </plugin>
    
    
            </plugins>
        </build>
    
    </project>
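
    For reference, below is a minimal sketch of a driver class that this build could package; the object name HiveOnSparkDemo is an illustrative assumption, not part of the original post. Because the Spark and Hadoop dependencies are marked provided, they are supplied at runtime from /opt/cloudera/parcels/CDH/lib/spark/jars when the jar is submitted to the cluster.

    import org.apache.spark.sql.SparkSession

    // Hypothetical entry point for the jar produced by this pom.
    object HiveOnSparkDemo {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("HiveOnSparkDemo")
          .enableHiveSupport() // resolve tables through the Hive metastore
          .getOrCreate()
        spark.sql("show databases").show()
        spark.stop()
      }
    }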
    

    Connecting a SparkSession to Hive

    // Create a SparkSession connected to Hive
    import org.apache.spark.SparkConf
    import org.apache.spark.sql.SparkSession

    def sparkSession(): SparkSession = {
      // Use Kryo serialization and point Spark SQL at the Hive warehouse
      // and metastore; replace the placeholder values with your own.
      val conf = new SparkConf()
        .set("spark.sql.legacy.allowCreatingManagedTableUsingNonemptyLocation", "true")
        .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
        .set("spark.sql.warehouse.dir", "your hive warehouse")
        .set("hive.metastore.uris", "your hive metastore uris")
      // enableHiveSupport() lets the session read and write Hive tables.
      SparkSession.builder()
        .config(conf)
        .enableHiveSupport()
        .getOrCreate()
    }
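
    A brief usage sketch, assuming the warehouse and metastore placeholders above are filled in with real values (the database and table names below are hypothetical):

    val spark = sparkSession()
    // Run SQL against Hive through the session; replace some_db.some_table.
    spark.sql("select * from some_db.some_table limit 10").show()
    spark.stop()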
    
  • Original post: https://blog.csdn.net/IT_liuzhiyuan/article/details/125526942