创建的时候用 LOCATION 指定原数据存储的路径,不指定的话 hive 会在 /usr/hive/warehouse 下以外部表的表名创建目录并将数据存储在这里
$ CREATE EXTERNAL TABLE t_external (year string, month int, num int)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
LOCATION '/user/test/t_extends'
此时在 HDFS 上创建了空目录 /user/test/t_extends
同时,在 MySQL 中记录的 Hive 元数据中添加了表的信息:
mysql > select * from TBLS\G;
- *************************** 5. row ***************************
- TBL_ID: 21
- CREATE_TIME: 1663561577
- DB_ID: 1
- LAST_ACCESS_TIME: 0
- OWNER: hadoop
- OWNER_TYPE: USER
- RETENTION: 0
- SD_ID: 21
- TBL_NAME: t_external
- TBL_TYPE: EXTERNAL_TABLE
- VIEW_EXPANDED_TEXT: NULL
- VIEW_ORIGINAL_TEXT: NULL
- IS_REWRITE_ENABLED:
- 5 rows in set (0.00 sec)
$ vim t_extends.txt
- 2019,02,15
- 2020,10,01
- 2021,03,19
- 2022,01,10
- 2022,03,21
- 2022,09,19
可用:
$ hdfs dfs -put t_extends.txt /user/test/t_extends
也可以用:
hive> load data local inpath '/home/centosm/test/hive' into table t_external
加载数据后文件被导入到 /user/test/t_extends/ 目录下:
$ hdfs dfs -ls /user/test/t_extends/
- Found 1 items
- -rw-r--r-- 3 hadoop supergroup 66 2022-09-19 14:18 /user/test/t_extends/t_extends.txt
$ hdfs dfs -cat /user/test/t_extends/t_extends.txt
- 2022-09-19 14:43:42,163 INFO sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
- 2019,02,15
- 2020,10,01
- 2021,03,19
- 2022,01,10
- 2022,03,21
- 2022,09,19
hive> select * from t_external;
- OK
- 2019 2 15
- 2020 10 1
- 2021 3 19
- 2022 1 10
- 2022 3 21
- 2022 9 19
- Time taken: 3.177 seconds, Fetched: 6 row(s)
hive> select INPU__FILE__NAME form t_external;
- OK
- hdfs://mycluster/user/test/t_extends/t_extends.txt
- hdfs://mycluster/user/test/t_extends/t_extends.txt
- hdfs://mycluster/user/test/t_extends/t_extends.txt
- hdfs://mycluster/user/test/t_extends/t_extends.txt
- hdfs://mycluster/user/test/t_extends/t_extends.txt
- hdfs://mycluster/user/test/t_extends/t_extends.txt
- Time taken: 0.213 seconds, Fetched: 6 row(s)
如果删除表,表的元信息被删除,因为是外部表,数据文件没有被删除:
hive> drop table t_external;
MySQL 中元数据被删除:
mysql> select * from TBLS where TBL_NAME='t_external';
Empty set (0.00 sec)
数据文件还在:
$ hdfs dfs -ls /user/test/t_extends/
- Found 1 items
- -rw-r--r-- 3 hadoop supergroup 66 2022-09-19 14:18 /user/test/t_extends/t_extends.txt