The Scala Book:https://docs.scala-lang.org/overviews/scala-book/
Scala Basics:https://docs.scala-lang.org/tour/basics.html
Indexing via take: val s = scala.collection.mutable.LinkedHashSet[(String, String)]()
var x = 2 // variable
x = x + 1 // correct
val x = 2 // value (constant)
x = x + 1 // wrong! a val cannot be reassigned
println({
val x = 1 + 1
x + 1 // the last expression is the result of the whole block
}) // 3
val addOne = (x: Int) => x + 1
println(addOne(1)) // 2
The parameter list can be empty:
val getTheAnswer = () => 42
println(getTheAnswer()) // 42
def whileLoop(condition: => Boolean)(body: => Unit): Unit =
  if (condition) { body; whileLoop(condition)(body) }
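A quick usage sketch of the by-name whileLoop signature above (the counter and sum variables are mine, added for illustration):

```scala
// By-name parameters: condition and body are re-evaluated on every call.
def whileLoop(condition: => Boolean)(body: => Unit): Unit =
  if (condition) { body; whileLoop(condition)(body) }

var i = 0   // hypothetical loop counter
var sum = 0
whileLoop(i < 3) { // condition is re-checked before each iteration
  sum += i
  i += 1
}
// sum is now 0 + 1 + 2 = 3
```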
def methodName(param: ParamType, …): ReturnType
The last expression is the return value:
def add(x: Int, y: Int): Int = x + y
A trait contains its own methods; it can be extended and implemented, but cannot be instantiated directly.
trait Iterator[A] {
  def hasNext: Boolean
  def next(): A
}
Extending the trait Iterator[A] requires a type A and implementations of the methods hasNext and next.
class IntIterator(to: Int) extends Iterator[Int] {
  private var current = 0
  override def hasNext: Boolean = current < to
  override def next(): Int = {
    if (hasNext) { val t = current; current += 1; t } else 0
  }
}
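A self-contained sketch of the IntIterator example in use (trait and class repeated here, with the counter named consistently so it runs as-is):

```scala
trait Iterator[A] {
  def hasNext: Boolean
  def next(): A
}

class IntIterator(to: Int) extends Iterator[Int] {
  private var current = 0
  override def hasNext: Boolean = current < to
  override def next(): Int =
    if (hasNext) { val t = current; current += 1; t } else 0
}

// Drain the iterator into a buffer to observe the produced values.
val it = new IntIterator(3)
val collected = scala.collection.mutable.ListBuffer[Int]()
while (it.hasNext) collected += it.next()
// collected: ListBuffer(0, 1, 2)
```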
class User
val user1 = new User // instantiates the class
Example code:
//class
class User
val user = new User
class Point(var x: Int, var y: Int) { // x and y are member variables of the Point class
  def move(dx: Int, dy: Int): Unit = {
    x = x + dx
    y = y + dy
  }
  override def toString: String = s"($x, $y)"
}
val point1 = new Point(2, 3)
println(point1.x) // prints 2
println(point1) // prints (2, 3)
Note: when mixing Java and Scala, default parameter values in Scala constructors are not visible to Java callers.
Similar to Python's tuples:
val ingredient = ("Sugar", 25)
println(ingredient._1) // Sugar
println(ingredient._2) // 25
val numbers = List(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
val res = numbers.foldLeft(0)((m, n) => m + n)
println(res) // 55
The first argument is the initial value; the second is a function defining how the accumulator is combined with each element of the List.
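Because foldLeft applies the function left-to-right, the order of combination matters for non-commutative operations. A small sketch (this example is mine, not from the tutorial):

```scala
val letters = List("a", "b", "c")

// The accumulator starts as "" and each element is appended on the right:
// (("" + "a") + "b") + "c"
val joined = letters.foldLeft("")((acc, s) => acc + s)
// joined == "abc"
```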
ride event
rideId : Long // a unique id for each ride
taxiId : Long // a unique id for each taxi
driverId : Long // a unique id for each driver
isStart : Boolean // TRUE for ride start events, FALSE for ride end events
eventTime : Instant // the timestamp for this event
startLon : Float // the longitude of the ride start location
startLat : Float // the latitude of the ride start location
endLon : Float // the longitude of the ride end location
endLat : Float // the latitude of the ride end location
passengerCnt : Short // number of passengers on the ride
fare event
rideId : Long // a unique id for each ride
taxiId : Long // a unique id for each taxi
driverId : Long // a unique id for each driver
startTime : Instant // the start time for this ride
paymentType : String // CASH or CARD
tip : Float // tip for this ride
tolls : Float // tolls for this ride
totalFare : Float // total fare collected
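The two event schemas above could be sketched as Scala case classes (field names follow the lists; java.time.Instant is used for timestamps; this is a sketch, not the actual flink-training classes, and the sample values are made up):

```scala
import java.time.Instant

case class TaxiRide(
  rideId: Long,       // unique id for each ride
  taxiId: Long,       // unique id for each taxi
  driverId: Long,     // unique id for each driver
  isStart: Boolean,   // true for ride start events, false for ride end events
  eventTime: Instant, // timestamp for this event
  startLon: Float, startLat: Float,
  endLon: Float, endLat: Float,
  passengerCnt: Short
)

case class TaxiFare(
  rideId: Long,
  taxiId: Long,
  driverId: Long,
  startTime: Instant,
  paymentType: String, // "CASH" or "CARD"
  tip: Float,
  tolls: Float,
  totalFare: Float
)

// Hypothetical sample event:
val ride = TaxiRide(1L, 2013000001L, 2013000001L, isStart = true,
  Instant.parse("2020-01-01T12:00:00Z"), -73.99f, 40.75f, -73.98f, 40.76f, 1)
```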

2. Download the Flink training exercises:
https://github.com/apache/flink-training/tree/release-1.15/
On Windows the scripts may fail with [\r': command not found] (caused by CRLF line endings).

Fix the line-ending style (convert CRLF to LF):
https://blog.csdn.net/fangye945a/article/details/120660824

3. Complete the Exercise.scala file, then run the test file to see the results.
Scala can also call Java functions: https://docs.scala-lang.org/scala3/book/interacting-with-java.html
Tutorial link: https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/learn-flink/overview/
Streams can be read from message queues or distributed log systems such as Apache Kafka or Kinesis, but Flink can also read from bounded sources. The same applies to outputs (sinks).

Flink programs are inherently parallel and distributed.
Each data stream has one or more stream partitions.
Each operator has one or more operator subtasks; different operators can run at different levels of parallelism.

One-to-one streams: preserve partitioning and ordering, e.g. between source and map.
Redistributing streams: change the partitioning of the stream, e.g. keyBy or rebalance.
Event time: can be tracked by embedding timestamps in the data.
Flink’s operations can be stateful. This means that how one event is handled can depend on the accumulated effect of all the events that came before it.
A Flink application is run in parallel on a distributed cluster. The various parallel instances of a given operator will execute independently, in separate threads, and in general will be running on different machines.

The 3rd operator is stateful.
A fully-connected network shuffle is occurring between the second and third operators.
This is being done to partition the stream by some key, so that all of the events that need to be processed together, will be.
Tutorial link: https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/learn-flink/datastream_api/
When StreamExecutionEnvironment.execute() is called, the job graph is packaged up and sent to the JobManager, which parallelizes the job and distributes slices of it to the TaskManagers for execution.
1. env.fromCollection: create a stream from a collection
List<Person> people = new ArrayList<>();
people.add(new Person("Fred", 35));
people.add(new Person("Wilma", 35));
people.add(new Person("Pebbles", 2));
DataStream<Person> flintstones = env.fromCollection(people);
2. env.socketTextStream / env.readTextFile: read from a socket / a file
DataStream<String> lines = env.socketTextStream("localhost", 9999);
DataStream<String> lines = env.readTextFile("file:///path");
Summary: streams are mainly created through methods on the execution environment (env).
Streams can also be debugged locally, e.g. by setting breakpoints.
Tutorial link: https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/learn-flink/etl/
Flink’s table API:https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/dev/table/overview/
map(): suitable only for one-to-one transformations (exactly one output event per input event)
flatMap(): for all other cases (zero, one, or many output events per input)
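The same distinction can be seen on plain Scala collections, whose map/flatMap behave analogously to Flink's DataStream operators (the lists and predicates here are my own illustration):

```scala
val nums = List(1, 2, 3)

// map: exactly one output per input element
val doubled = nums.map(_ * 2) // List(2, 4, 6)

// flatMap: zero, one, or many outputs per input element;
// here odd numbers are duplicated and even numbers are dropped
val expanded = nums.flatMap(n => if (n % 2 == 1) List(n, n) else Nil)
// expanded: List(1, 1, 3, 3)
```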