• Spark operator basics


    Create an RDD from an array of (name, score1, score2, score3) tuples
    scala> val a = sc.parallelize(Array(("zhangsan", 99, 98, 100),("lisi", 99, 98, 100),("wangwu", 99, 98, 100)))
    a: org.apache.spark.rdd.RDD[(String, Int, Int, Int)] = ParallelCollectionRDD[28] at parallelize at :24

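    Task: among the records whose name starts with "zhang", find the one with the highest total score.
    Approach 1: sum the three scores, sort by total in descending order, and take the first element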
    scala> a.filter(x=>x._1.startsWith("zhang")).map(x=>(x._1,x._2+x._3+x._4)).sortBy(x=> - x._2).take(1)
    res30: Array[(String, Int)] = Array((zhangsan,297))

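    Approach 2: reduceByKey keyed on the full name. This keeps the maximum per distinct name, so it yields a single answer only while exactly one name matches the prefix. No action (such as collect) is called here, so the REPL shows only the lazy RDD.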
    scala> a.filter(x=>x._1.startsWith("zhang")).map(x=>(x._1,x._2+x._3+x._4)).reduceByKey((x,y) => if (x>y) x else y)
    res28: org.apache.spark.rdd.RDD[(String, Int)] = ShuffledRDD[51] at reduceByKey at :26

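    Approach 3: put the total first in the tuple and call max (tuples are compared by the total first, then by name)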
    scala> a.filter(x=>x._1.startsWith("zhang")).map(x=>(x._2+x._3+x._4,x._1)).max
    res31: (Int, String) = (297,zhangsan)

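    Approach 4: map every matching record to the shared key "zhang", then reduceByKey to keep the record with the larger total; the second command drops the key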
    scala> a.filter(x=>x._1.startsWith("zhang")).map(x=>("zhang",(x._1,x._2+x._3+x._4))).reduceByKey((x,y) => if (x._2>y._2) x else y).collect
    res34: Array[(String, (String, Int))] = Array((zhang,(zhangsan,297)))
    scala> a.filter(x=>x._1.startsWith("zhang")).map(x=>("zhang",(x._1,x._2+x._3+x._4))).reduceByKey((x,y) => if (x._2>y._2) x else y).map(x=>x._2).collect
    res35: Array[(String, Int)] = Array((zhangsan,297))

    Approach 5: groupBy the first five characters of the name, then scan each group for the largest total
    scala> a.filter(x => x._1.startsWith("zhang")).map(x => (x._1, x._2 + x._3 + x._4)).
         |   groupBy(x => x._1.substring(0, 5)).map(x => {
         |     var name = ""
         |     var sumscore = 0
         |     val itor = x._2.iterator
         |     for (e <- itor) {
         |       if (e._2 > sumscore) {
         |         name = e._1
         |         sumscore = e._2
         |       }
         |     }
         |     (name, sumscore)
         |   }).collect()
    res3: Array[(String, Int)] = Array((zhangsan,297))
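
The five approaches above share the same filter and map steps and differ only in how the maximum is selected. As a further minimal sketch (not taken from the original post), the selection can also be done with the RDD `top` method and an ordering on the total. The object name `TopScore` and the helper `bestByTotal` are hypothetical, and the example assumes Spark is on the classpath and is run in local mode.

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.SparkSession

object TopScore {
  // Hypothetical helper: returns the (name, total) pair with the highest total
  // among names starting with `prefix`, or None if no name matches.
  def bestByTotal(rows: RDD[(String, Int, Int, Int)], prefix: String): Option[(String, Int)] =
    rows
      .filter { case (name, _, _, _) => name.startsWith(prefix) }
      .map { case (name, s1, s2, s3) => (name, s1 + s2 + s3) }
      // top(1) returns the largest element under the given ordering;
      // here the ordering compares only the total score.
      .top(1)(Ordering.by[(String, Int), Int](_._2))
      .headOption

  def main(args: Array[String]): Unit = {
    // Local-mode session, assumed only for this illustration.
    val spark = SparkSession.builder().appName("TopScore").master("local[*]").getOrCreate()
    try {
      val a = spark.sparkContext.parallelize(Seq(
        ("zhangsan", 99, 98, 100), ("lisi", 99, 98, 100), ("wangwu", 99, 98, 100)))
      println(bestByTotal(a, "zhang")) // prints Some((zhangsan,297))
    } finally {
      spark.stop()
    }
  }
}
```

Compared with Approach 1, `top(1)` does not need to sort the whole RDD, and unlike Approach 2 it still returns a single record when several names share the prefix. Wrapping the result in an Option also covers the case where nothing matches.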

  • Original article: https://blog.csdn.net/jmz98/article/details/133705409