hadoop 3.1.0
hive-3.1.3
tez 0.9.1
I can access s3a URIs correctly from the Hadoop command line, and I can create an external table and query it with commands like:
- create external table mytable(a string, b string) location 's3a://mybucket/myfolder/';
- select * from mytable limit 20;
Both execute correctly, but
select count(*) from mytable;
fails with the following log:
- INFO : Compiling command(queryId=root_20230919030746_7b38e3c8-8429-4d45-8a01-343bd26d8f6e): select count(*) from lyb0
- INFO : Concurrency mode is disabled, not creating a lock manager
- INFO : Semantic Analysis Completed (retrial = false)
- INFO : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:_c0, type:bigint, comment:null)], properties:null)
- INFO : Completed compiling command(queryId=root_20230919030746_7b38e3c8-8429-4d45-8a01-343bd26d8f6e); Time taken: 0.257 seconds
- INFO : Concurrency mode is disabled, not creating a lock manager
- INFO : Executing command(queryId=root_20230919030746_7b38e3c8-8429-4d45-8a01-343bd26d8f6e): select count(*) from lyb0
- INFO : Query ID = root_20230919030746_7b38e3c8-8429-4d45-8a01-343bd26d8f6e
- INFO : Total jobs = 1
- INFO : Launching Job 1 out of 1
- INFO : Starting task [Stage-1:MAPRED] in serial mode
- INFO : Subscribed to counters: [] for queryId: root_20230919030746_7b38e3c8-8429-4d45-8a01-343bd26d8f6e
- INFO : Session is already open
- INFO : Dag name: select count(*) from lyb0 (Stage-1)
- INFO : Status: Running (Executing on YARN cluster with App id application_1695092793092_0001)
-
-
- ----------------------------------------------------------------------------------------------
- VERTICES MODE STATUS TOTAL COMPLETED RUNNING PENDING FAILED KILLED
- ----------------------------------------------------------------------------------------------
- Map 1 container INITIALIZING -1 0 0 -1 0 0
- Reducer 2 container INITED 1 0 0 1 0 0
- ----------------------------------------------------------------------------------------------
- VERTICES: 00/02 [>>--------------------------] 0% ELAPSED TIME: 9.55 s
- ----------------------------------------------------------------------------------------------
- ERROR : Status: Failed
- ERROR : Vertex failed, vertexName=Map 1, vertexId=vertex_1695092793092_0001_3_00, diagnostics=[Vertex vertex_1695092793092_0001_3_00 [Map 1] killed/failed due to:ROOT_INPUT_INIT_FAILURE, Vertex Input: lyb0 initializer failed, vertex=vertex_1695092793092_0001_3_00 [Map 1], org.apache.hadoop.fs.s3a.AWSClientIOException: doesBucketExist on hivesql: com.amazonaws.AmazonClientException: No AWS Credentials provided by SimpleAWSCredentialsProvider : org.apache.hadoop.fs.s3a.CredentialInitializationException: Access key or secret key is unset: No AWS Credentials provided by SimpleAWSCredentialsProvider : org.apache.hadoop.fs.s3a.CredentialInitializationException: Access key or secret key is unset
- at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:177)
- ……
- at java.lang.Thread.run(Thread.java:750)
- Caused by: com.amazonaws.AmazonClientException: No AWS Credentials provided by SimpleAWSCredentialsProvider : org.apache.hadoop.fs.s3a.CredentialInitializationException: Access key or secret key is unset
- at org.apache.hadoop.fs.s3a.AWSCredentialProviderList.getCredentials(AWSCredentialProviderList.java:139)
- ……
- at org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:109)
- ... 31 more
- Caused by: org.apache.hadoop.fs.s3a.CredentialInitializationException: Access key or secret key is unset
- at org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider.getCredentials(SimpleAWSCredentialsProvider.java:75)
- at org.apache.hadoop.fs.s3a.AWSCredentialProviderList.getCredentials(AWSCredentialProviderList.java:117)
- ... 45 more
- ]
- ERROR : Vertex killed, vertexName=Reducer 2, vertexId=vertex_1695092793092_0001_3_01, diagnostics=[Vertex received Kill in INITED state., Vertex vertex_1695092793092_0001_3_01 [Reducer 2] killed/failed due to:OTHER_VERTEX_FAILURE]
- ERROR : DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:1
- ERROR : FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed, vertexName=Map 1, vertexId=vertex_1695092793092_0001_3_00, diagnostics=[Vertex vertex_1695092793092_0001_3_00 [Map 1] killed/failed due to:ROOT_INPUT_INIT_FAILURE, Vertex Input: lyb0 initializer failed, vertex=vertex_1695092793092_0001_3_00 [Map 1], org.apache.hadoop.fs.s3a.AWSClientIOException: doesBucketExist on hivesql: com.amazonaws.AmazonClientException: No AWS Credentials provided by SimpleAWSCredentialsProvider : org.apache.hadoop.fs.s3a.CredentialInitializationException: Access key or secret key is unset: No AWS Credentials provided by SimpleAWSCredentialsProvider : org.apache.hadoop.fs.s3a.CredentialInitializationException: Access key or secret key is unset
- at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:177)
- ……
I tried adding all the fs.s3a properties from core-site.xml to tez-site.xml, and also setting fs.s3a.access.key and fs.s3a.secret.key inside the Hive session, but the same error still occurs.
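For reference, the fs.s3a credential properties I copied into tez-site.xml looked roughly like this (placeholder values, not the real keys):

```xml
<property>
  <name>fs.s3a.access.key</name>
  <value>YOUR_ACCESS_KEY_ID</value>
</property>
<property>
  <name>fs.s3a.secret.key</name>
  <value>YOUR_SECRET_ACCESS_KEY</value>
</property>
```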
Make sure tez.use.cluster.hadoop-libs is not set in tez-site.xml, or if it is set, that its value is false.
However, when it is set to false, Tez does not run at all.
When it is set to true, I got the AWS credentials error, even though the credentials were set in every possible place and environment variable.
I finally got it working by adding this property to hive-site.xml:
- <property>
- <name>hive.conf.hidden.list</name>
- <value>javax.jdo.option.ConnectionPassword,hive.server2.keystore.password,fs.s3a.proxy.password,dfs.adls.oauth2.credential,fs.adl.oauth2.credential</value>
- </property>
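This works because Hive's default hive.conf.hidden.list includes fs.s3a.access.key and fs.s3a.secret.key, so Hive strips those values from the configuration it hands to the Tez AM and its tasks; overriding the list with a value that omits the two s3a key entries lets the credentials propagate to the vertices. A quick sanity check (a sketch, assuming the key is set in core-site.xml and HiveServer2 has been restarted with the overridden list):

```sql
-- In a Beeline session: once fs.s3a.access.key is no longer on the
-- hidden list, this should echo the configured value instead of
-- being suppressed.
set fs.s3a.access.key;
```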
This is the correct solution. But just so you know, you are now exposing the S3 secret key in various log files, including the Hive and Hadoop logs. If you have access to the source code, you can modify it so that these properties are not printed in the Hive logs.
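An alternative worth mentioning (standard Hadoop functionality, not part of the answer above) is to keep the keys out of plain-text XML entirely by using the Hadoop credential provider framework: store them in a JCEKS keystore and point S3A at it.

```xml
<!-- Sketch: assumes the keystore was created beforehand with
       hadoop credential create fs.s3a.access.key -provider jceks://hdfs/user/root/s3.jceks
       hadoop credential create fs.s3a.secret.key -provider jceks://hdfs/user/root/s3.jceks
     The keystore path is only an example. -->
<property>
  <name>hadoop.security.credential.provider.path</name>
  <value>jceks://hdfs/user/root/s3.jceks</value>
</property>
```

With this in place, S3A resolves the keys through the provider, so they never appear as clear text in configuration dumps.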