https://help.aliyun.com/zh/maxcompute/user-guide/mappings-between-built-in-functions-of-maxcompute-and-built-in-functions-of-hive-mysql-and-oracle
public abstract BufferedInputStream readResourceFileAsStream(String var1) throws IOException;
LocalExecutionContext.java
@Override
public BufferedInputStream readResourceFileAsStream(String resourceName) throws IOException {
  try {
    return wareHouse.readResourceFileAsStream(wareHouse.getOdps().getDefaultProject(),
                                              resourceName, ',');
  } catch (OdpsException e) {
    throw new IOException(e.getMessage());
  }
}
WareHouse.java
public BufferedInputStream readResourceFileAsStream(String project, String resource,
                                                    char inputColumnSeperator)
    throws IOException, OdpsException {
  // If the resource is not in the local warehouse yet, download it from the
  // remote default project first.
  if (!existsResource(project, resource)) {
    DownloadUtils.downloadResource(WareHouse.getInstance().getOdps(), getOdps()
        .getDefaultProject(), resource, getLimitDownloadRecordCount(), inputColumnSeperator);
  }
  if (!existsResource(project, resource)) {
    throw new OdpsException("File Resource " + project + "." + resource + " not exists");
  }
  // Only file resources can be read as a stream; a resource that maps to a
  // directory is rejected here.
  File file = getReourceFile(project, resource);
  if (!file.isFile()) {
    throw new OdpsException("Resource " + project + "." + resource
        + " is not a valid file Resource, because it is a direcotry");
  }
  return new BufferedInputStream(new FileInputStream(file));
}
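To see where these methods get exercised, here is a minimal sketch of a UDF that reads a file resource through ExecutionContext in setup() and uses it per row. It assumes the standard com.aliyun.odps.udf.UDF and ExecutionContext classes; the resource name whitelist.txt and the evaluate logic are made up for illustration. In local run mode the call lands in LocalExecutionContext.readResourceFileAsStream above, which lets WareHouse download the resource into the local warehouse on first use.

import java.io.BufferedInputStream;
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.HashSet;
import java.util.Set;

import com.aliyun.odps.udf.ExecutionContext;
import com.aliyun.odps.udf.UDF;
import com.aliyun.odps.udf.UDFException;

// Hypothetical UDF: loads a file resource once in setup() and looks it up per row.
public class ResourceLookupUdf extends UDF {

  private final Set<String> whitelist = new HashSet<>();

  @Override
  public void setup(ExecutionContext ctx) throws UDFException {
    // "whitelist.txt" is a made-up resource name; the resource must already be
    // added to the project (or placed in warehouse/ for local run).
    try (BufferedInputStream in = ctx.readResourceFileAsStream("whitelist.txt");
         BufferedReader reader =
             new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))) {
      String line;
      while ((line = reader.readLine()) != null) {
        whitelist.add(line.trim());
      }
    } catch (IOException e) {
      throw new UDFException(e.getMessage());
    }
  }

  public Boolean evaluate(String name) {
    return whitelist.contains(name);
  }
}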
/**
 * A BufferedInputStream adds functionality to another input stream, namely
 * the ability to buffer the input and to support the mark and reset methods.
 * When the BufferedInputStream is created, an internal buffer array is
 * created. As bytes from the stream are read or skipped, the internal buffer
 * is refilled as necessary from the contained input stream, many bytes at a
 * time. The mark operation remembers a point in the input stream and the
 * reset operation causes all the bytes read since the most recent mark
 * operation to be reread before new bytes are taken from the contained
 * input stream.
 *
 * @author Arthur van Hoff
 * @since JDK1.0
 */
public class BufferedInputStream extends FilterInputStream {

    private static int DEFAULT_BUFFER_SIZE = 8192;
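The javadoc above centers on mark/reset, so here is a small, self-contained sketch of that behavior; the byte values, the 8192 buffer size, and the mark read limit are arbitrary:

import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;

public class MarkResetDemo {
  public static void main(String[] args) throws IOException {
    byte[] data = {1, 2, 3, 4, 5};
    // The second argument is the internal buffer size (defaults to 8192, as above).
    try (BufferedInputStream in =
             new BufferedInputStream(new ByteArrayInputStream(data), 8192)) {
      System.out.println(in.read()); // 1
      in.mark(16);                   // remember this position; valid while at most 16 bytes are read
      System.out.println(in.read()); // 2
      System.out.println(in.read()); // 3
      in.reset();                    // rewind to the marked position
      System.out.println(in.read()); // 2 again, replayed from the internal buffer
    }
  }
}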
explain EXTENDED select * from person LATERAL VIEW EXPLODE(ARRAY(30, 60)) name1 AS c_age left join person1 on person.id = person1.id;
Job Queueing...
job0 is root job
In Job job0:
root Tasks: M1
M2_1 depends on: M1
In Task M1:
  Data source: hongta_test_mc.person1
  TS: hongta_test_mc.person1
    Statistics: Num rows: 4.0, Data size: 848.0
  FIL: ISNOTNULL(id)
    Statistics: Num rows: 3.6, Data size: 763.2
  RS: valueDestLimit: 0
    dist: BROADCAST
    keys:
    values:
      id (int)
      name (string)
      age (int)
      class (int)
      address (string)
    partitions:
    Statistics: Num rows: 3.6, Data size: 763.2

In Task M2_1:
  Data source: hongta_test_mc.person
  TS: hongta_test_mc.person
    Statistics: Num rows: 4.0, Data size: 848.0
  SEL: id, name, age, class, address, [30,60] __tvf_arg_0
    Statistics: Num rows: 4.0, Data size: 1008.0
  TVF: EXPLODE(__tvf_arg_0) (c_age)
    Statistics: Num rows: 20.0, Data size: 4320.0
  HASHJOIN:
    TableFunctionScan1 LEFTJOIN StreamLineRead1
    keys:
      0:id
      1:id
    non-equals:
      0:
      1:
    bigTable: TableFunctionScan1
    Statistics: Num rows: 36.0, Data size: 15408.0
  FS: output: Screen
    schema:
      id (int)
      name (string)
      age (int)
      class (int)
      address (string)
      c_age (int)
      id (int) AS id2
      name (string) AS name2
      age (int) AS age2
      class (int) AS class2
      address (string) AS address2
    Statistics: Num rows: 36.0, Data size: 15408.0
explain select * from person left join person1 on person.id = person1.id LATERAL VIEW EXPLODE(ARRAY(30, 60)) name1 AS c_age;
Job Queueing...
job0 is root job
In Job job0:
root Tasks: M1
M2_1 depends on: M1
In Task M1:
  Data source: hongta_test_mc.person1
  TS: hongta_test_mc.person1
    Statistics: Num rows: 4.0, Data size: 848.0
  FIL: ISNOTNULL(id)
    Statistics: Num rows: 3.6, Data size: 763.2
  RS: valueDestLimit: 0
    dist: BROADCAST
    keys:
    values:
      id (int)
      name (string)
      age (int)
      class (int)
      address (string)
    partitions:
    Statistics: Num rows: 3.6, Data size: 763.2

In Task M2_1:
  Data source: hongta_test_mc.person
  TS: hongta_test_mc.person
    Statistics: Num rows: 4.0, Data size: 848.0
  HASHJOIN:
    TableScan1 LEFTJOIN StreamLineRead1
    keys:
      0:id
      1:id
    non-equals:
      0:
      1:
    bigTable: TableScan1
    Statistics: Num rows: 7.2, Data size: 3052.8
  SEL: id, name, age, class, address, id id5, name name6, age age7, class class8, address address9, [30,60] __tvf_arg_0
    Statistics: Num rows: 7.2, Data size: 3340.8
  TVF: EXPLODE(__tvf_arg_0) (c_age)
    Statistics: Num rows: 36.0, Data size: 15408.0
  FS: output: Screen
    schema:
      id (int)
      name (string)
      age (int)
      class (int)
      address (string)
      id5 (int) AS id2
      name6 (string) AS name2
      age7 (int) AS age2
      class8 (int) AS class2
      address9 (string) AS address2
      c_age (int)
    Statistics: Num rows: 36.0, Data size: 15408.0
OK