Friday, December 1, 2017

Login as Keberos user(principal) after Hive enabled with Kerberos authentication

[donghua@cdh-vm log]$ klist
Ticket cache: FILE:/tmp/krb5cc_1000
Default principal: donghua@DBAGLOBE.COM

Valid starting       Expires              Service principal
12/01/2017 07:47:45  12/02/2017 07:47:45  krbtgt/DBAGLOBE.COM@DBAGLOBE.COM


[donghua@cdh-vm log]$ beeline
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=512M; support was removed in 8.0
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=512M; support was removed in 8.0
Beeline version 1.1.0-cdh5.13.0 by Apache Hive
beeline> ! connect jdbc:hive2://cdh-vm.dbaglobe.com:10000/default;user=donghua@DBAGLOBE.COM;principal=hive/cdh-vm.dbaglobe.com@DBAGLOBE.COM 

====================Remarks from Donghua================

user=donghua@DBAGLOBE.COM is not required/valid, it will be silently ignored.

correct connection URL as below:

connect jdbc:hive2://cdh-vm.dbaglobe.com:10000/default;principal=hive/cdh-vm.dbaglobe.com@DBAGLOBE.COM

Check the example for proxy user authentication:  https://github.com/apache/hive/blob/master/beeline/src/test/org/apache/hive/beeline/ProxyAuthTest.java

====================Remarks from Donghua================     
scan complete in 2ms
Connecting to jdbc:hive2://cdh-vm.dbaglobe.com:10000/default;user=donghua@DBAGLOBE.COM;principal=hive/cdh-vm.dbaglobe.com@DBAGLOBE.COM
Connected to: Apache Hive (version 1.1.0-cdh5.13.0)
Driver: Hive JDBC (version 1.1.0-cdh5.13.0)
Transaction isolation: TRANSACTION_REPEATABLE_READ
0: jdbc:hive2://cdh-vm.dbaglobe.com:10000/def> select current_user();
INFO  : Compiling command(queryId=hive_20171201095151_96b71dfe-7fb2-4f42-b371-f377d1cfc4a3): select current_user()
INFO  : Semantic Analysis Completed
INFO  : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:_c0, type:string, comment:null)], properties:null)
INFO  : Completed compiling command(queryId=hive_20171201095151_96b71dfe-7fb2-4f42-b371-f377d1cfc4a3); Time taken: 0.104 seconds
INFO  : Executing command(queryId=hive_20171201095151_96b71dfe-7fb2-4f42-b371-f377d1cfc4a3): select current_user()
INFO  : Completed executing command(queryId=hive_20171201095151_96b71dfe-7fb2-4f42-b371-f377d1cfc4a3); Time taken: 0.002 seconds
INFO  : OK
+----------+--+
|   _c0    |
+----------+--+
| donghua  |
+----------+--+

1 row selected (0.236 seconds)
0: jdbc:hive2://cdh-vm.dbaglobe.com:10000/def> show tables;
INFO  : Compiling command(queryId=hive_20171201095151_34ef3482-508d-483b-8477-16d8eb3e41ff): show tables
INFO  : Semantic Analysis Completed
INFO  : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:tab_name, type:string, comment:from deserializer)], properties:null)
INFO  : Completed compiling command(queryId=hive_20171201095151_34ef3482-508d-483b-8477-16d8eb3e41ff); Time taken: 0.01 seconds
INFO  : Executing command(queryId=hive_20171201095151_34ef3482-508d-483b-8477-16d8eb3e41ff): show tables
INFO  : Starting task [Stage-0:DDL] in serial mode
INFO  : Completed executing command(queryId=hive_20171201095151_34ef3482-508d-483b-8477-16d8eb3e41ff); Time taken: 0.028 seconds
INFO  : OK
+--------------+--+
|   tab_name   |
+--------------+--+
| iotdatademo  |
+--------------+--+

1 row selected (0.057 seconds)
0: jdbc:hive2://cdh-vm.dbaglobe.com:10000/def> desc iotdatademo;
INFO  : Compiling command(queryId=hive_20171201095151_9bdd46e2-6ab3-483c-b942-f498119841fe): desc iotdatademo
INFO  : Semantic Analysis Completed
INFO  : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:col_name, type:string, comment:from deserializer), FieldSchema(name:data_type, type:string, comment:from deserializer), FieldSchema(name:comment, type:string, comment:from deserializer)], properties:null)
INFO  : Completed compiling command(queryId=hive_20171201095151_9bdd46e2-6ab3-483c-b942-f498119841fe); Time taken: 0.037 seconds
INFO  : Executing command(queryId=hive_20171201095151_9bdd46e2-6ab3-483c-b942-f498119841fe): desc iotdatademo
INFO  : Starting task [Stage-0:DDL] in serial mode
INFO  : Completed executing command(queryId=hive_20171201095151_9bdd46e2-6ab3-483c-b942-f498119841fe); Time taken: 0.014 seconds
INFO  : OK
+------------+------------+----------+--+
|  col_name  | data_type  | comment  |
+------------+------------+----------+--+
| stationid  | bigint     |          |
| date       | bigint     |          |
| datetime   | timestamp  |          |
| day        | bigint     |          |
| speed      | double     |          |
| volume     | bigint     |          |
+------------+------------+----------+--+

6 rows selected (0.087 seconds)
0: jdbc:hive2://cdh-vm.dbaglobe.com:10000/def> select day,count(*) from iotdatademo group by day order by day;
INFO  : Compiling command(queryId=hive_20171201095151_c400f069-2ede-4dab-86c8-113b3824d3d7): select day,count(*) from iotdatademo group by day order by day
INFO  : Semantic Analysis Completed
INFO  : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:day, type:bigint, comment:null), FieldSchema(name:_c1, type:bigint, comment:null)], properties:null)
INFO  : Completed compiling command(queryId=hive_20171201095151_c400f069-2ede-4dab-86c8-113b3824d3d7); Time taken: 0.071 seconds
INFO  : Executing command(queryId=hive_20171201095151_c400f069-2ede-4dab-86c8-113b3824d3d7): select day,count(*) from iotdatademo group by day order by day
INFO  : Query ID = hive_20171201095151_c400f069-2ede-4dab-86c8-113b3824d3d7
INFO  : Total jobs = 2
INFO  : Launching Job 1 out of 2
INFO  : Starting task [Stage-1:MAPRED] in serial mode
INFO  : Number of reduce tasks not specified. Estimated from input data size: 1
INFO  : In order to change the average load for a reducer (in bytes):
INFO  :   set hive.exec.reducers.bytes.per.reducer=<number>
INFO  : In order to limit the maximum number of reducers:
INFO  :   set hive.exec.reducers.max=<number>
INFO  : In order to set a constant number of reducers:
INFO  :   set mapreduce.job.reduces=<number>
INFO  : number of splits:1
INFO  : Submitting tokens for job: job_1512125736892_0002
INFO  : Kind: HDFS_DELEGATION_TOKEN, Service: 192.168.56.10:8020, Ident: (token for donghua: HDFS_DELEGATION_TOKEN owner=donghua, renewer=yarn, realUser=hive/cdh-vm.dbaglobe.com@DBAGLOBE.COM, issueDate=1512139913090, maxDate=1512744713090, sequenceNumber=15, masterKeyId=4)
INFO  : Kind: HIVE_DELEGATION_TOKEN, Service: HiveServer2ImpersonationToken, Ident: 00 07 64 6f 6e 67 68 75 61 07 64 6f 6e 67 68 75 61 25 68 69 76 65 2f 63 64 68 2d 76 6d 2e 64 62 61 67 6c 6f 62 65 2e 63 6f 6d 40 44 42 41 47 4c 4f 42 45 2e 43 4f 4d 8a 01 60 12 8e af 59 8a 01 60 36 9b 33 59 04 01
INFO  : The url to track the job:
http://cdh-vm.dbaglobe.com:8088/proxy/application_1512125736892_0002/
INFO  : Starting Job = job_1512125736892_0002, Tracking URL = http://cdh-vm.dbaglobe.com:8088/proxy/application_1512125736892_0002/
INFO  : Kill Command = /opt/cloudera/parcels/CDH-5.13.0-1.cdh5.13.0.p0.29/lib/hadoop/bin/hadoop job  -kill job_1512125736892_0002
INFO  : Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
INFO  : 2017-12-01 09:52:03,269 Stage-1 map = 0%,  reduce = 0%
INFO  : 2017-12-01 09:52:09,589 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 2.43 sec
INFO  : 2017-12-01 09:52:15,870 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 4.29 sec
INFO  : MapReduce Total cumulative CPU time: 4 seconds 290 msec
INFO  : Ended Job = job_1512125736892_0002
INFO  : Launching Job 2 out of 2
INFO  : Starting task [Stage-2:MAPRED] in serial mode
INFO  : Number of reduce tasks determined at compile time: 1
INFO  : In order to change the average load for a reducer (in bytes):
INFO  :   set hive.exec.reducers.bytes.per.reducer=<number>
INFO  : In order to limit the maximum number of reducers:
INFO  :   set hive.exec.reducers.max=<number>
INFO  : In order to set a constant number of reducers:
INFO  :   set mapreduce.job.reduces=<number>
INFO  : number of splits:1
INFO  : Submitting tokens for job: job_1512125736892_0003
INFO  : Kind: HDFS_DELEGATION_TOKEN, Service: 192.168.56.10:8020, Ident: (token for donghua: HDFS_DELEGATION_TOKEN owner=donghua, renewer=yarn, realUser=hive/cdh-vm.dbaglobe.com@DBAGLOBE.COM, issueDate=1512139937049, maxDate=1512744737049, sequenceNumber=16, masterKeyId=4)
INFO  : Kind: HIVE_DELEGATION_TOKEN, Service: HiveServer2ImpersonationToken, Ident: 00 07 64 6f 6e 67 68 75 61 07 64 6f 6e 67 68 75 61 25 68 69 76 65 2f 63 64 68 2d 76 6d 2e 64 62 61 67 6c 6f 62 65 2e 63 6f 6d 40 44 42 41 47 4c 4f 42 45 2e 43 4f 4d 8a 01 60 12 8e af 59 8a 01 60 36 9b 33 59 04 01
INFO  : The url to track the job:
http://cdh-vm.dbaglobe.com:8088/proxy/application_1512125736892_0003/
INFO  : Starting Job = job_1512125736892_0003, Tracking URL = http://cdh-vm.dbaglobe.com:8088/proxy/application_1512125736892_0003/
INFO  : Kill Command = /opt/cloudera/parcels/CDH-5.13.0-1.cdh5.13.0.p0.29/lib/hadoop/bin/hadoop job  -kill job_1512125736892_0003
INFO  : Hadoop job information for Stage-2: number of mappers: 1; number of reducers: 1
INFO  : 2017-12-01 09:52:25,667 Stage-2 map = 0%,  reduce = 0%
INFO  : 2017-12-01 09:52:30,914 Stage-2 map = 100%,  reduce = 0%, Cumulative CPU 1.07 sec
INFO  : 2017-12-01 09:52:38,226 Stage-2 map = 100%,  reduce = 100%, Cumulative CPU 3.01 sec
INFO  : MapReduce Total cumulative CPU time: 3 seconds 10 msec
INFO  : Ended Job = job_1512125736892_0003
INFO  : MapReduce Jobs Launched:
INFO  : Stage-Stage-1: Map: 1  Reduce: 1   Cumulative CPU: 4.29 sec   HDFS Read: 46845578 HDFS Write: 250 SUCCESS
INFO  : Stage-Stage-2: Map: 1  Reduce: 1   Cumulative CPU: 3.01 sec   HDFS Read: 4924 HDFS Write: 59 SUCCESS
INFO  : Total MapReduce CPU Time Spent: 7 seconds 300 msec
INFO  : Completed executing command(queryId=hive_20171201095151_c400f069-2ede-4dab-86c8-113b3824d3d7); Time taken: 47.188 seconds
INFO  : OK
+------+---------+--+
| day  |   _c1   |
+------+---------+--+
| 0    | 172800  |
| 1    | 86410   |
| 2    | 86400   |
| 3    | 86400   |
| 4    | 86400   |
| 5    | 172800  |
| 6    | 172800  |
+------+---------+--+

7 rows selected (47.315 seconds)
0: jdbc:hive2://cdh-vm.dbaglobe.com:10000/def> !quit
Closing: 0: jdbc:hive2://cdh-vm.dbaglobe.com:10000/default;user=donghua@DBAGLOBE.COM;principal=hive/cdh-vm.dbaglobe.com@DBAGLOBE.COM

Reference: https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Clients#HiveServer2Clients-JDBCClientSetupforaSecureCluster

JDBC Client Setup for a Secure Cluster

When connecting to HiveServer2 with Kerberos authentication, the URL format is:

jdbc:hive2://<host>:<port>/<db>;principal=<Server_Principal_of_HiveServer2>

The client needs to have a valid Kerberos ticket in the ticket cache before connecting.

NOTE: If you don't have a "/" after the port number, the jdbc driver does not parse the hostname and ends up running HS2 in embedded mode . So if you are specifying a hostname, make sure you have a "/" or "/<dbname>" after the port number.

In the case of LDAP, CUSTOM or PAM authentication, the client needs to pass a valid user name and password to the JDBC connection API.

To use sasl.qop, add the following to the sessionconf part of your Hive JDBC hive connection string, e.g.
jdbc:hive://hostname/dbname;sasl.qop=auth-int

For more information, see Setting Up HiveServer2.