donghua@cqlsh:mykeyspace> create table t3 (id int, d timestamp, primary key(id));
donghua@cqlsh:mykeyspace> desc t3;
CREATE TABLE mykeyspace.t3 (
id int PRIMARY KEY,
d timestamp
) WITH bloom_filter_fp_chance = 0.01
AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
AND comment = ''
AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
AND crc_check_chance = 1.0
AND dclocal_read_repair_chance = 0.1
AND default_time_to_live = 0
AND gc_grace_seconds = 864000
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 0
AND min_index_interval = 128
AND read_repair_chance = 0.0
AND speculative_retry = '99PERCENTILE';
donghua@cqlsh:mykeyspace> insert into t3 (id,d) values (1,totimestamp(now()));
donghua@cqlsh:mykeyspace> insert into t3 (id,d) values (2,'2016-10-17T10:00:00');
donghua@cqlsh:mykeyspace> insert into t3 (id,d) values (3,'2016-10-17T10:00:00+0800');
donghua@cqlsh:mykeyspace> insert into t3 (id,d) values (4,'2016-10-17T10:00:00+0000');
donghua@cqlsh:mykeyspace> select * from t3;
id | d
----+---------------------------------
1 | 2016-10-17 10:20:35.416000+0000
2 | 2016-10-17 02:00:00.000000+0000
4 | 2016-10-17 10:00:00.000000+0000
3 | 2016-10-17 02:00:00.000000+0000
(4 rows)
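Note how rows 2 and 3 both come back as 02:00 UTC: a literal without a timezone is interpreted in the client's local timezone (+0800 in this session, which is why rows 2 and 3 match) before being stored as UTC. A quick java.time sketch of that conversion, assuming the +08:00 client zone:
import java.time.{LocalDateTime, ZoneId, ZoneOffset}
// '2016-10-17T10:00:00' carries no zone, so cqlsh applies the client zone (+08:00 here)
val local = LocalDateTime.parse("2016-10-17T10:00:00")
val utc = local.atZone(ZoneId.of("+08:00")).withZoneSameInstant(ZoneOffset.UTC)
// utc: 2016-10-17T02:00Z -- matching rows 2 and 3 above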
donghua@cqlsh:mykeyspace> copy t3 TO '/tmp/t3-1.csv' with header=true;
Reading options from the command line: {'header': 'true'}
Using 1 child processes
Starting copy of mykeyspace.t3 with columns [id, d].
Processed: 4 rows; Rate: 5 rows/s; Avg. rate: 4 rows/s
4 rows exported to 1 files in 0.945 seconds.
donghua@cqlsh:mykeyspace> copy t3 TO '/tmp/t3-2.csv' with header=true and datetimeformat='%m/%d/%Y';
Reading options from the command line: {'datetimeformat': '%m/%d/%Y', 'header': 'true'}
Using 1 child processes
Starting copy of mykeyspace.t3 with columns [id, d].
Processed: 4 rows; Rate: 5 rows/s; Avg. rate: 4 rows/s
4 rows exported to 1 files in 1.070 seconds.
[donghua@localhost ~]$ cat /tmp/t3-1.csv
id,d
2,2016-10-17 02:00:00.000+0000
1,2016-10-17 10:20:35.416+0000
3,2016-10-17 02:00:00.000+0000
4,2016-10-17 10:00:00.000+0000
[donghua@localhost ~]$ cat /tmp/t3-2.csv
id,d
2,10/17/2016
1,10/17/2016
3,10/17/2016
4,10/17/2016
donghua@cqlsh:mykeyspace> truncate table t3;
donghua@cqlsh:mykeyspace> select * from t3;
id | d
----+---
(0 rows)
donghua@cqlsh:mykeyspace> copy t3 from '/tmp/t3-1.csv' with header=true;
Reading options from the command line: {'header': 'true'}
Using 1 child processes
Starting copy of mykeyspace.t3 with columns [id, d].
Processed: 4 rows; Rate: 4 rows/s; Avg. rate: 7 rows/s
4 rows imported from 1 files in 0.548 seconds (0 skipped).
donghua@cqlsh:mykeyspace> select * from t3;
id | d
----+---------------------------------
1 | 2016-10-17 10:20:35.416000+0000
2 | 2016-10-17 02:00:00.000000+0000
4 | 2016-10-17 10:00:00.000000+0000
3 | 2016-10-17 02:00:00.000000+0000
(4 rows)
donghua@cqlsh:mykeyspace> truncate table t3;
donghua@cqlsh:mykeyspace> copy t3 from '/tmp/t3-2.csv' with header=true and datetimeformat='%m/%d/%Y';
Reading options from the command line: {'datetimeformat': '%m/%d/%Y', 'header': 'true'}
Using 1 child processes
Starting copy of mykeyspace.t3 with columns [id, d].
Processed: 4 rows; Rate: 5 rows/s; Avg. rate: 7 rows/s
4 rows imported from 1 files in 0.544 seconds (0 skipped).
donghua@cqlsh:mykeyspace> select * from t3;
id | d
----+---------------------------------
1 | 2016-10-17 00:00:00.000000+0000
2 | 2016-10-17 00:00:00.000000+0000
4 | 2016-10-17 00:00:00.000000+0000
3 | 2016-10-17 00:00:00.000000+0000
(4 rows)
donghua@cqlsh:mykeyspace>
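The round trip through t3-2.csv shows the catch with datetimeformat='%m/%d/%Y': the format keeps only the date, so every re-imported row lands on midnight UTC and the original time-of-day is lost. A small java.time sketch of the loss:
import java.time.LocalDateTime
import java.time.format.DateTimeFormatter
val fmt = DateTimeFormatter.ofPattern("MM/dd/yyyy")
// formatting discards hours/minutes/seconds entirely
val exported = LocalDateTime.of(2016, 10, 17, 10, 20, 35).format(fmt) // "10/17/2016"
// nothing in "10/17/2016" can restore 10:20:35, so the import defaults to 00:00:00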
Monday, October 17, 2016
Thursday, October 13, 2016
CassandraSQLContext not found after spark-cassandra-connector upgrade to v2.0
scala> val cass=new CassandraSQLContext(sc)
<console>:23: error: not found: type CassandraSQLContext
       val cass=new CassandraSQLContext(sc)
                    ^
scala> import org.apache.spark.sql.cassandra.CassandraSQLContext
<console>:24: error: object CassandraSQLContext is not a member of package org.apache.spark.sql.cassandra
       import org.apache.spark.sql.cassandra.CassandraSQLContext
                                             ^
==================
https://github.com/datastax/spark-cassandra-connector/blob/07b47effeb10480b4b80b2686d0e6874aefa0a24/CHANGES.txt
2.0.0 M1
 * Added support for left outer joins with C* table (SPARKC-181)
 * Removed CassandraSqlContext and underscore based options (SPARKC-399)
 * Upgrade to Spark 2.0.0-preview (SPARKC-396)
   - Removed Twitter demo because there is no spark-streaming-twitter package available anymore
   - Removed Akka Actor demo because there is no support for such streams anymore
   - Bring back Kafka project and make it compile
   - Update several classes to use our Logging instead of Spark Logging because Spark Logging became private
   - Update few components and tests to make them work with Spark 2.0.0
   - Fix Spark SQL - temporarily
   - Update plugins and Scala version
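Since CassandraSqlContext was removed (SPARKC-399), the DataFrame reader is the replacement from spark-shell in connector 2.0. A minimal sketch, with illustrative table/keyspace names:
scala> val df = spark.read.format("org.apache.spark.sql.cassandra").options(Map("table" -> "kv", "keyspace" -> "mykeyspace")).load()
scala> df.count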
Integrate Apache Spark with Apache Cassandra after enabling user authorization
[donghua@localhost conf]$ $SPARK_HOME/bin/spark-sql --hiveconf spark.cassandra.connection.host=127.0.0.1 --hiveconf spark.cassandra.auth.username=donghua --hiveconf spark.cassandra.auth.password=new_pasword
16/10/13 21:44:09 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/10/13 21:44:23 WARN ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 1.2.0
16/10/13 21:44:23 WARN ObjectStore: Failed to get database default, returning NoSuchObjectException
16/10/13 21:44:25 WARN Utils: Your hostname, localhost.localdomain resolves to a loopback address: 127.0.0.1; using 10.0.2.15 instead (on interface enp0s3)
16/10/13 21:44:25 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
16/10/13 21:44:27 WARN SparkContext: Use an existing SparkContext, some configuration may not take effect.
spark-sql> create TEMPORARY TABLE t1 USING org.apache.spark.sql.cassandra OPTIONS (table "t1",keyspace "mykeyspace", cluster "Test Cluster",pushdown "true");
16/10/13 21:44:40 WARN SparkStrategies$DDLStrategy: CREATE TEMPORARY TABLE t1 USING... is deprecated, please use CREATE TEMPORARY VIEW viewName USING... instead
16/10/13 21:44:42 ERROR SparkSQLDriver: Failed in [create TEMPORARY TABLE t1 USING org.apache.spark.sql.cassandra OPTIONS (table "t1",keyspace "mykeyspace", cluster "Test Cluster",pushdown "true")]
com.datastax.driver.core.exceptions.UnauthorizedException: User donghua has no SELECT permission on <table system.size_estimates> or any of its parents
at com.datastax.driver.core.exceptions.UnauthorizedException.copy(UnauthorizedException.java:59)
at com.datastax.driver.core.exceptions.UnauthorizedException.copy(UnauthorizedException.java:25)
at com.datastax.driver.core.DriverThrowables.propagateCause(DriverThrowables.java:37)
at com.datastax.driver.core.DefaultResultSetFuture.getUninterruptibly(DefaultResultSetFuture.java:245)
at com.datastax.driver.core.AbstractSession.execute(AbstractSession.java:64)
at com.datastax.driver.core.AbstractSession.execute(AbstractSession.java:47)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at com.datastax.spark.connector.cql.SessionProxy.invoke(SessionProxy.scala:33)
at com.sun.proxy.$Proxy29.execute(Unknown Source)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at com.datastax.spark.connector.cql.SessionProxy.invoke(SessionProxy.scala:33)
at com.sun.proxy.$Proxy29.execute(Unknown Source)
at com.datastax.spark.connector.rdd.partitioner.DataSizeEstimates$$anonfun$tokenRanges$1.apply(DataSizeEstimates.scala:40)
at com.datastax.spark.connector.rdd.partitioner.DataSizeEstimates$$anonfun$tokenRanges$1.apply(DataSizeEstimates.scala:38)
at com.datastax.spark.connector.cql.CassandraConnector$$anonfun$withSessionDo$1.apply(CassandraConnector.scala:111)
at com.datastax.spark.connector.cql.CassandraConnector$$anonfun$withSessionDo$1.apply(CassandraConnector.scala:110)
at com.datastax.spark.connector.cql.CassandraConnector.closeResourceAfterUse(CassandraConnector.scala:140)
at com.datastax.spark.connector.cql.CassandraConnector.withSessionDo(CassandraConnector.scala:110)
at com.datastax.spark.connector.rdd.partitioner.DataSizeEstimates.tokenRanges$lzycompute(DataSizeEstimates.scala:38)
at com.datastax.spark.connector.rdd.partitioner.DataSizeEstimates.tokenRanges(DataSizeEstimates.scala:37)
at com.datastax.spark.connector.rdd.partitioner.DataSizeEstimates.totalDataSizeInBytes$lzycompute(DataSizeEstimates.scala:88)
at com.datastax.spark.connector.rdd.partitioner.DataSizeEstimates.totalDataSizeInBytes(DataSizeEstimates.scala:87)
at org.apache.spark.sql.cassandra.CassandraSourceRelation$.apply(CassandraSourceRelation.scala:260)
at org.apache.spark.sql.cassandra.DefaultSource.createRelation(DefaultSource.scala:55)
at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:330)
at org.apache.spark.sql.execution.datasources.CreateTempViewUsing.run(ddl.scala:87)
at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:60)
at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:58)
at org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:74)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:115)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:115)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:136)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:133)
at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:114)
at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:86)
at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:86)
at org.apache.spark.sql.Dataset.<init>(Dataset.scala:186)
at org.apache.spark.sql.Dataset.<init>(Dataset.scala:167)
at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:65)
at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:582)
at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:682)
at org.apache.spark.sql.hive.thriftserver.SparkSQLDriver.run(SparkSQLDriver.scala:62)
at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.processCmd(SparkSQLCLIDriver.scala:331)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver$.main(SparkSQLCLIDriver.scala:247)
at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.main(SparkSQLCLIDriver.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:736)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:185)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:210)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:124)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: com.datastax.driver.core.exceptions.UnauthorizedException: User donghua has no SELECT permission on <table system.size_estimates> or any of its parents
at com.datastax.driver.core.Responses$Error.asException(Responses.java:134)
at com.datastax.driver.core.DefaultResultSetFuture.onSet(DefaultResultSetFuture.java:179)
at com.datastax.driver.core.RequestHandler.setFinalResult(RequestHandler.java:173)
at com.datastax.driver.core.RequestHandler.access$2500(RequestHandler.java:43)
at com.datastax.driver.core.RequestHandler$SpeculativeExecution.setFinalResult(RequestHandler.java:788)
at com.datastax.driver.core.RequestHandler$SpeculativeExecution.onSet(RequestHandler.java:607)
at com.datastax.driver.core.Connection$Dispatcher.channelRead0(Connection.java:1012)
at com.datastax.driver.core.Connection$Dispatcher.channelRead0(Connection.java:935)
at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:318)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:304)
at io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:266)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:318)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:304)
at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:318)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:304)
at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:276)
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:263)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:318)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:304)
at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:846)
at io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:823)
at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:339)
at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:255)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:112)
at java.lang.Thread.run(Thread.java:745)
com.datastax.driver.core.exceptions.UnauthorizedException: User donghua has no SELECT permission on <table system.size_estimates> or any of its parents
at com.datastax.driver.core.exceptions.UnauthorizedException.copy(UnauthorizedException.java:59)
at com.datastax.driver.core.exceptions.UnauthorizedException.copy(UnauthorizedException.java:25)
at com.datastax.driver.core.DriverThrowables.propagateCause(DriverThrowables.java:37)
at com.datastax.driver.core.DefaultResultSetFuture.getUninterruptibly(DefaultResultSetFuture.java:245)
at com.datastax.driver.core.AbstractSession.execute(AbstractSession.java:64)
at com.datastax.driver.core.AbstractSession.execute(AbstractSession.java:47)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at com.datastax.spark.connector.cql.SessionProxy.invoke(SessionProxy.scala:33)
at com.sun.proxy.$Proxy29.execute(Unknown Source)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at com.datastax.spark.connector.cql.SessionProxy.invoke(SessionProxy.scala:33)
at com.sun.proxy.$Proxy29.execute(Unknown Source)
at com.datastax.spark.connector.rdd.partitioner.DataSizeEstimates$$anonfun$tokenRanges$1.apply(DataSizeEstimates.scala:40)
at com.datastax.spark.connector.rdd.partitioner.DataSizeEstimates$$anonfun$tokenRanges$1.apply(DataSizeEstimates.scala:38)
at com.datastax.spark.connector.cql.CassandraConnector$$anonfun$withSessionDo$1.apply(CassandraConnector.scala:111)
at com.datastax.spark.connector.cql.CassandraConnector$$anonfun$withSessionDo$1.apply(CassandraConnector.scala:110)
at com.datastax.spark.connector.cql.CassandraConnector.closeResourceAfterUse(CassandraConnector.scala:140)
at com.datastax.spark.connector.cql.CassandraConnector.withSessionDo(CassandraConnector.scala:110)
at com.datastax.spark.connector.rdd.partitioner.DataSizeEstimates.tokenRanges$lzycompute(DataSizeEstimates.scala:38)
at com.datastax.spark.connector.rdd.partitioner.DataSizeEstimates.tokenRanges(DataSizeEstimates.scala:37)
at com.datastax.spark.connector.rdd.partitioner.DataSizeEstimates.totalDataSizeInBytes$lzycompute(DataSizeEstimates.scala:88)
at com.datastax.spark.connector.rdd.partitioner.DataSizeEstimates.totalDataSizeInBytes(DataSizeEstimates.scala:87)
at org.apache.spark.sql.cassandra.CassandraSourceRelation$.apply(CassandraSourceRelation.scala:260)
at org.apache.spark.sql.cassandra.DefaultSource.createRelation(DefaultSource.scala:55)
at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:330)
at org.apache.spark.sql.execution.datasources.CreateTempViewUsing.run(ddl.scala:87)
at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:60)
at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:58)
at org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:74)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:115)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:115)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:136)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:133)
at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:114)
at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:86)
at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:86)
at org.apache.spark.sql.Dataset.<init>(Dataset.scala:186)
at org.apache.spark.sql.Dataset.<init>(Dataset.scala:167)
at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:65)
at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:582)
at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:682)
at org.apache.spark.sql.hive.thriftserver.SparkSQLDriver.run(SparkSQLDriver.scala:62)
at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.processCmd(SparkSQLCLIDriver.scala:331)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver$.main(SparkSQLCLIDriver.scala:247)
at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.main(SparkSQLCLIDriver.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:736)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:185)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:210)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:124)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: com.datastax.driver.core.exceptions.UnauthorizedException: User donghua has no SELECT permission on <table system.size_estimates> or any of its parents
at com.datastax.driver.core.Responses$Error.asException(Responses.java:134)
at com.datastax.driver.core.DefaultResultSetFuture.onSet(DefaultResultSetFuture.java:179)
at com.datastax.driver.core.RequestHandler.setFinalResult(RequestHandler.java:173)
at com.datastax.driver.core.RequestHandler.access$2500(RequestHandler.java:43)
at com.datastax.driver.core.RequestHandler$SpeculativeExecution.setFinalResult(RequestHandler.java:788)
at com.datastax.driver.core.RequestHandler$SpeculativeExecution.onSet(RequestHandler.java:607)
at com.datastax.driver.core.Connection$Dispatcher.channelRead0(Connection.java:1012)
at com.datastax.driver.core.Connection$Dispatcher.channelRead0(Connection.java:935)
at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:318)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:304)
at io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:266)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:318)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:304)
at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:318)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:304)
at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:276)
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:263)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:318)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:304)
at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:846)
at io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:823)
at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:339)
at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:255)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:112)
at java.lang.Thread.run(Thread.java:745)
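The giveaway is DataSizeEstimates.tokenRanges in the stack: before splitting the scan into Spark partitions, the connector reads the system.size_estimates table, which needs its own SELECT grant. For example, the kind of query it relies on (sketched, columns abbreviated):
cqlsh> select keyspace_name, table_name, partitions_count from system.size_estimates where keyspace_name = 'mykeyspace' limit 3;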
Steps to fix:
cassandra@cqlsh> grant select on system.size_estimates to donghua;
cassandra@cqlsh> list all permissions of donghua;
 role             | username         | resource                      | permission
------------------+------------------+-------------------------------+------------
          donghua |          donghua |         <table mykeyspace.t1> |      ALTER
          donghua |          donghua |         <table mykeyspace.t1> |       DROP
          donghua |          donghua |         <table mykeyspace.t1> |     SELECT
          donghua |          donghua |         <table mykeyspace.t1> |     MODIFY
          donghua |          donghua |         <table mykeyspace.t1> |  AUTHORIZE
          donghua |          donghua | <table system.size_estimates> |     SELECT
 mykeyspace_owner | mykeyspace_owner |         <keyspace mykeyspace> |     CREATE
 mykeyspace_owner | mykeyspace_owner |         <keyspace mykeyspace> |      ALTER
 mykeyspace_owner | mykeyspace_owner |         <keyspace mykeyspace> |       DROP
 mykeyspace_owner | mykeyspace_owner |         <keyspace mykeyspace> |     SELECT
 mykeyspace_owner | mykeyspace_owner |         <keyspace mykeyspace> |     MODIFY
 mykeyspace_owner | mykeyspace_owner |         <keyspace mykeyspace> |  AUTHORIZE
(12 rows)
spark-sql> create TEMPORARY TABLE t1 USING org.apache.spark.sql.cassandra OPTIONS (table "t1",keyspace "mykeyspace", cluster "Test Cluster",pushdown "true");
16/10/13 21:46:46 WARN SparkStrategies$DDLStrategy: CREATE TEMPORARY TABLE t1 USING... is deprecated, please use CREATE TEMPORARY VIEW viewName USING... instead
Time taken: 2.462 seconds
spark-sql> select count(*) from t1;
445674
Time taken: 12.389 seconds, Fetched 1 row(s)
spark-sql>
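As the deprecation warning suggests, the same mapping can be declared with the newer CREATE TEMPORARY VIEW form:
spark-sql> create TEMPORARY VIEW t1 USING org.apache.spark.sql.cassandra OPTIONS (table "t1",keyspace "mykeyspace", cluster "Test Cluster",pushdown "true");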
How to fix OperationTimedOut when using cqlsh to query a large dataset
cqlsh:mykeyspace2> select d,min(e),max(e) from mykeyspace.t1;
OperationTimedOut: errors={'127.0.0.1': 'Client request timeout. See Session.execute[_async](timeout)'}, last_host=127.0.0.1
[donghua@localhost cluster]$ touch ~/.cassandra/cqlshrc
[donghua@localhost cluster]$ cat ~/.cassandra/cqlshrc
[connection]
client_timeout = 3600 # default is 10 seconds
# Can also be set to None to disable:
# client_timeout = None
[donghua@localhost cluster]$ cqlsh
Connected to SingleServer Cluster at 127.0.0.1:9042.
[cqlsh 5.0.1 | Cassandra 3.9 | CQL spec 3.4.2 | Native protocol v4]
Use HELP for help.
cqlsh> select d,min(e),max(e) from mykeyspace.t1;
d | system.min(e) | system.max(e)
---+---------------+---------------
5 | 1.00033 | 159.99989
(1 rows)
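If editing cqlshrc is inconvenient, the cqlsh shipped with Cassandra 3.x also appears to accept the timeout as a command-line flag (in seconds):
[donghua@localhost cluster]$ cqlsh --request-timeout=3600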
Wednesday, October 12, 2016
How to include multiple JAR files for Spark-shell and spark-sql
# Below are the additional JAR files required to connect to Apache Cassandra
[donghua@localhost ~]$ ls -l /home/donghua/spark-casssandra-connector/*
-rw-rw-r--. 1 donghua donghua 231320 Oct 12 15:39 /home/donghua/spark-casssandra-connector/commons-beanutils-1.8.0.jar
-rw-rw-r--. 1 donghua donghua 38460 Oct 12 15:39 /home/donghua/spark-casssandra-connector/joda-convert-1.2.jar
-rw-rw-r--. 1 donghua donghua 581571 Oct 12 15:39 /home/donghua/spark-casssandra-connector/joda-time-2.3.jar
-rw-rw-r--. 1 donghua donghua 62226 Oct 12 15:40 /home/donghua/spark-casssandra-connector/jsr166e-1.1.0.jar
-rw-rw-r--. 1 donghua donghua 2112017 Oct 12 15:39 /home/donghua/spark-casssandra-connector/netty-all-4.0.33.Final.jar
-rw-rw-r--. 1 donghua donghua 4573750 Oct 12 15:41 /home/donghua/spark-casssandra-connector/scala-reflect-2.11.8.jar
-rw-rw-r--. 1 donghua donghua 6036067 Oct 12 15:39 /home/donghua/spark-casssandra-connector/spark-cassandra-connector-2.0.0-M2-s_2.11.jar
[donghua@localhost ~]$ ls spark-2.0.1-bin-hadoop2.7/conf/
docker.properties.template log4j.properties.template slaves.template spark-defaults.conf.template
fairscheduler.xml.template metrics.properties.template spark-defaults.conf spark-env.sh.template
[donghua@localhost ~]$ cat spark-2.0.1-bin-hadoop2.7/conf/spark-defaults.conf
#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
# Default system properties included when running spark-submit.
# This is useful for setting default environmental settings.
# Example:
# spark.master spark://master:7077
# spark.eventLog.enabled true
# spark.eventLog.dir hdfs://namenode:8021/directory
# spark.serializer org.apache.spark.serializer.KryoSerializer
# spark.driver.memory 5g
# spark.executor.extraJavaOptions -XX:+PrintGCDetails -Dkey=value -Dnumbers="one two three"
#
spark.driver.extraClassPath /home/donghua/spark-casssandra-connector/*
spark.executor.extraClassPath /home/donghua/spark-casssandra-connector/*
# To execute without using --jars to include the additional JAR files
[donghua@localhost spark-2.0.1-bin-hadoop2.7]$ $SPARK_HOME/bin/spark-shell --conf spark.cassandra.connection.host=127.0.0.1
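For comparison, the same JARs can be passed explicitly instead; --jars takes a comma-separated list, built here with a small shell substitution:
[donghua@localhost ~]$ $SPARK_HOME/bin/spark-shell --conf spark.cassandra.connection.host=127.0.0.1 --jars $(echo /home/donghua/spark-casssandra-connector/*.jar | tr ' ' ',')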
Query a Cassandra database using the spark-sql shell from Apache Spark
[donghua@vmxdb01 ~]$ $SPARK_HOME/bin/spark-sql --packages datastax:spark-cassandra-connector:2.0.0-M2-s_2.11 --conf spark.cassandra.connection.host=127.0.0.1
Ivy Default Cache set to: /home/donghua/.ivy2/cache
The jars for the packages stored in: /home/donghua/.ivy2/jars
:: loading settings :: url = jar:file:/home/donghua/spark-2.0.1-bin-hadoop2.7/jars/ivy-2.4.0.jar!/org/apache/ivy/core/settings/ivysettings.xml
datastax#spark-cassandra-connector added as a dependency
:: resolving dependencies :: org.apache.spark#spark-submit-parent;1.0
confs: [default]
found datastax#spark-cassandra-connector;2.0.0-M2-s_2.11 in spark-packages
found commons-beanutils#commons-beanutils;1.8.0 in central
found org.joda#joda-convert;1.2 in central
found joda-time#joda-time;2.3 in central
found io.netty#netty-all;4.0.33.Final in central
found com.twitter#jsr166e;1.1.0 in central
found org.scala-lang#scala-reflect;2.11.8 in central
:: resolution report :: resolve 869ms :: artifacts dl 15ms
:: modules in use:
com.twitter#jsr166e;1.1.0 from central in [default]
commons-beanutils#commons-beanutils;1.8.0 from central in [default]
datastax#spark-cassandra-connector;2.0.0-M2-s_2.11 from spark-packages in [default]
io.netty#netty-all;4.0.33.Final from central in [default]
joda-time#joda-time;2.3 from central in [default]
org.joda#joda-convert;1.2 from central in [default]
org.scala-lang#scala-reflect;2.11.8 from central in [default]
---------------------------------------------------------------------
| | modules || artifacts |
| conf | number| search|dwnlded|evicted|| number|dwnlded|
---------------------------------------------------------------------
| default | 7 | 0 | 0 | 0 || 7 | 0 |
---------------------------------------------------------------------
:: retrieving :: org.apache.spark#spark-submit-parent
confs: [default]
0 artifacts copied, 7 already retrieved (0kB/8ms)
log4j:WARN No appenders could be found for logger (org.apache.hadoop.util.Shell).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
16/10/12 08:15:01 INFO SparkContext: Running Spark version 2.0.1
16/10/12 08:15:02 INFO SecurityManager: Changing view acls to: donghua
16/10/12 08:15:02 INFO SecurityManager: Changing modify acls to: donghua
16/10/12 08:15:02 INFO SecurityManager: Changing view acls groups to:
16/10/12 08:15:02 INFO SecurityManager: Changing modify acls groups to:
16/10/12 08:15:02 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(donghua); groups with view permissions: Set(); users with modify permissions: Set(donghua); groups with modify permissions: Set()
16/10/12 08:15:02 INFO Utils: Successfully started service 'sparkDriver' on port 60027.
16/10/12 08:15:02 INFO SparkEnv: Registering MapOutputTracker
16/10/12 08:15:02 INFO SparkEnv: Registering BlockManagerMaster
16/10/12 08:15:02 INFO DiskBlockManager: Created local directory at /tmp/blockmgr-20513a2d-c965-416d-81ae-9668df59d304
16/10/12 08:15:02 INFO MemoryStore: MemoryStore started with capacity 366.3 MB
16/10/12 08:15:02 INFO SparkEnv: Registering OutputCommitCoordinator
16/10/12 08:15:02 INFO Utils: Successfully started service 'SparkUI' on port 4040.
16/10/12 08:15:02 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://192.168.6.151:4040
16/10/12 08:15:03 INFO SparkContext: Added JAR file:/home/donghua/.ivy2/jars/datastax_spark-cassandra-connector-2.0.0-M2-s_2.11.jar at spark://192.168.6.151:60027/jars/datastax_spark-cassandra-connector-2.0.0-M2-s_2.11.jar with timestamp 1476274503002
16/10/12 08:15:03 INFO SparkContext: Added JAR file:/home/donghua/.ivy2/jars/commons-beanutils_commons-beanutils-1.8.0.jar at spark://192.168.6.151:60027/jars/commons-beanutils_commons-beanutils-1.8.0.jar with timestamp 1476274503003
16/10/12 08:15:03 INFO SparkContext: Added JAR file:/home/donghua/.ivy2/jars/org.joda_joda-convert-1.2.jar at spark://192.168.6.151:60027/jars/org.joda_joda-convert-1.2.jar with timestamp 1476274503003
16/10/12 08:15:03 INFO SparkContext: Added JAR file:/home/donghua/.ivy2/jars/joda-time_joda-time-2.3.jar at spark://192.168.6.151:60027/jars/joda-time_joda-time-2.3.jar with timestamp 1476274503003
16/10/12 08:15:03 INFO SparkContext: Added JAR file:/home/donghua/.ivy2/jars/io.netty_netty-all-4.0.33.Final.jar at spark://192.168.6.151:60027/jars/io.netty_netty-all-4.0.33.Final.jar with timestamp 1476274503003
16/10/12 08:15:03 INFO SparkContext: Added JAR file:/home/donghua/.ivy2/jars/com.twitter_jsr166e-1.1.0.jar at spark://192.168.6.151:60027/jars/com.twitter_jsr166e-1.1.0.jar with timestamp 1476274503003
16/10/12 08:15:03 INFO SparkContext: Added JAR file:/home/donghua/.ivy2/jars/org.scala-lang_scala-reflect-2.11.8.jar at spark://192.168.6.151:60027/jars/org.scala-lang_scala-reflect-2.11.8.jar with timestamp 1476274503003
16/10/12 08:15:03 INFO Executor: Starting executor ID driver on host localhost
16/10/12 08:15:03 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 44882.
16/10/12 08:15:03 INFO NettyBlockTransferService: Server created on 192.168.6.151:44882
16/10/12 08:15:03 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, 192.168.6.151, 44882)
16/10/12 08:15:03 INFO BlockManagerMasterEndpoint: Registering block manager 192.168.6.151:44882 with 366.3 MB RAM, BlockManagerId(driver, 192.168.6.151, 44882)
16/10/12 08:15:03 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, 192.168.6.151, 44882)
16/10/12 08:15:03 WARN SparkContext: Use an existing SparkContext, some configuration may not take effect.
16/10/12 08:15:03 INFO HiveSharedState: Warehouse path is '/home/donghua/spark-warehouse'.
16/10/12 08:15:03 INFO HiveUtils: Initializing HiveMetastoreConnection version 1.2.1 using Spark classes.
16/10/12 08:15:04 INFO HiveMetaStore: 0: Opening raw store with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore
16/10/12 08:15:04 INFO ObjectStore: ObjectStore, initialize called
16/10/12 08:15:04 INFO Persistence: Property hive.metastore.integral.jdo.pushdown unknown - will be ignored
16/10/12 08:15:04 INFO Persistence: Property datanucleus.cache.level2 unknown - will be ignored
16/10/12 08:15:05 INFO ObjectStore: Setting MetaStore object pin classes with hive.metastore.cache.pinobjtypes="Table,StorageDescriptor,SerDeInfo,Partition,Database,Type,FieldSchema,Order"
16/10/12 08:15:06 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
16/10/12 08:15:06 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
16/10/12 08:15:06 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
16/10/12 08:15:06 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
16/10/12 08:15:07 INFO Query: Reading in results for query "org.datanucleus.store.rdbms.query.SQLQuery@0" since the connection used is closing
16/10/12 08:15:07 INFO MetaStoreDirectSql: Using direct SQL, underlying DB is DERBY
16/10/12 08:15:07 INFO ObjectStore: Initialized ObjectStore
16/10/12 08:15:07 INFO HiveMetaStore: Added admin role in metastore
16/10/12 08:15:07 INFO HiveMetaStore: Added public role in metastore
16/10/12 08:15:07 INFO HiveMetaStore: No user is added in admin role, since config is empty
16/10/12 08:15:07 INFO HiveMetaStore: 0: get_all_databases
16/10/12 08:15:07 INFO audit: ugi=donghua ip=unknown-ip-addr cmd=get_all_databases
16/10/12 08:15:07 INFO HiveMetaStore: 0: get_functions: db=default pat=*
16/10/12 08:15:07 INFO audit: ugi=donghua ip=unknown-ip-addr cmd=get_functions: db=default pat=*
16/10/12 08:15:07 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MResourceUri" is tagged as "embedded-only" so does not have its own datastore table.
16/10/12 08:15:07 INFO SessionState: Created local directory: /tmp/c37fe7b1-45a3-4ce1-b115-ea341dfe013b_resources
16/10/12 08:15:07 INFO SessionState: Created HDFS directory: /tmp/hive/donghua/c37fe7b1-45a3-4ce1-b115-ea341dfe013b
16/10/12 08:15:07 INFO SessionState: Created local directory: /tmp/donghua/c37fe7b1-45a3-4ce1-b115-ea341dfe013b
16/10/12 08:15:07 INFO SessionState: Created HDFS directory: /tmp/hive/donghua/c37fe7b1-45a3-4ce1-b115-ea341dfe013b/_tmp_space.db
16/10/12 08:15:07 INFO HiveClientImpl: Warehouse location for Hive client (version 1.2.1) is /home/donghua/spark-warehouse
16/10/12 08:15:07 INFO SessionState: Created local directory: /tmp/228512c3-917e-4070-b28d-3de3f688b746_resources
16/10/12 08:15:08 INFO SessionState: Created HDFS directory: /tmp/hive/donghua/228512c3-917e-4070-b28d-3de3f688b746
16/10/12 08:15:08 INFO SessionState: Created local directory: /tmp/donghua/228512c3-917e-4070-b28d-3de3f688b746
16/10/12 08:15:08 INFO SessionState: Created HDFS directory: /tmp/hive/donghua/228512c3-917e-4070-b28d-3de3f688b746/_tmp_space.db
16/10/12 08:15:08 INFO HiveClientImpl: Warehouse location for Hive client (version 1.2.1) is /home/donghua/spark-warehouse
spark-sql> select * from mykeyspace.kv;
16/10/12 08:15:12 INFO SparkSqlParser: Parsing command: select * from mykeyspace.kv
16/10/12 08:15:13 INFO HiveMetaStore: 0: create_database: Database(name:default, description:default database, locationUri:file:/home/donghua/spark-warehouse, parameters:{})
16/10/12 08:15:13 INFO audit: ugi=donghua ip=unknown-ip-addr cmd=create_database: Database(name:default, description:default database, locationUri:file:/home/donghua/spark-warehouse, parameters:{})
16/10/12 08:15:13 INFO HiveMetaStore: 0: get_database: mykeyspace
16/10/12 08:15:13 INFO audit: ugi=donghua ip=unknown-ip-addr cmd=get_database: mykeyspace
16/10/12 08:15:13 WARN ObjectStore: Failed to get database mykeyspace, returning NoSuchObjectException
Error in query: Table or view not found: `mykeyspace`.`kv`; line 1 pos 14
spark-sql> create TEMPORARY TABLE kv USING org.apache.spark.sql.cassandra OPTIONS (table "kv",keyspace "mykeyspace", cluster "Test Cluster",pushdown "true");
16/10/12 08:15:19 INFO SparkSqlParser: Parsing command: create TEMPORARY TABLE kv USING org.apache.spark.sql.cassandra OPTIONS (table "kv",keyspace "mykeyspace", cluster "Test Cluster",pushdown "true")
16/10/12 08:15:20 WARN SparkStrategies$DDLStrategy: CREATE TEMPORARY TABLE kv USING... is deprecated, please use CREATE TEMPORARY VIEW viewName USING... instead
16/10/12 08:15:21 INFO NettyUtil: Found Netty's native epoll transport in the classpath, using it
16/10/12 08:15:22 INFO Cluster: New Cassandra host /127.0.0.1:9042 added
16/10/12 08:15:22 INFO CassandraConnector: Connected to Cassandra cluster: Test Cluster
16/10/12 08:15:22 INFO SparkContext: Starting job: processCmd at CliDriver.java:376
16/10/12 08:15:22 INFO DAGScheduler: Got job 0 (processCmd at CliDriver.java:376) with 1 output partitions
16/10/12 08:15:22 INFO DAGScheduler: Final stage: ResultStage 0 (processCmd at CliDriver.java:376)
16/10/12 08:15:22 INFO DAGScheduler: Parents of final stage: List()
16/10/12 08:15:22 INFO DAGScheduler: Missing parents: List()
16/10/12 08:15:22 INFO DAGScheduler: Submitting ResultStage 0 (MapPartitionsRDD[3] at processCmd at CliDriver.java:376), which has no missing parents
16/10/12 08:15:22 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 3.2 KB, free 366.3 MB)
16/10/12 08:15:23 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 1964.0 B, free 366.3 MB)
16/10/12 08:15:23 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on 192.168.6.151:44882 (size: 1964.0 B, free: 366.3 MB)
16/10/12 08:15:23 INFO SparkContext: Created broadcast 0 from broadcast at DAGScheduler.scala:1012
16/10/12 08:15:23 INFO DAGScheduler: Submitting 1 missing tasks from ResultStage 0 (MapPartitionsRDD[3] at processCmd at CliDriver.java:376)
16/10/12 08:15:23 INFO TaskSchedulerImpl: Adding task set 0.0 with 1 tasks
16/10/12 08:15:23 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, localhost, partition 0, PROCESS_LOCAL, 6128 bytes)
16/10/12 08:15:23 INFO Executor: Running task 0.0 in stage 0.0 (TID 0)
16/10/12 08:15:23 INFO Executor: Fetching spark://192.168.6.151:60027/jars/com.twitter_jsr166e-1.1.0.jar with timestamp 1476274503003
16/10/12 08:15:23 INFO TransportClientFactory: Successfully created connection to /192.168.6.151:60027 after 62 ms (0 ms spent in bootstraps)
16/10/12 08:15:23 INFO Utils: Fetching spark://192.168.6.151:60027/jars/com.twitter_jsr166e-1.1.0.jar to /tmp/spark-39508304-8579-4e51-ae16-e1cf8a22c139/userFiles-b34a93b4-cffc-4ec3-bcdb-1ce2823ac122/fetchFileTemp1249905734998620423.tmp
16/10/12 08:15:23 INFO Executor: Adding file:/tmp/spark-39508304-8579-4e51-ae16-e1cf8a22c139/userFiles-b34a93b4-cffc-4ec3-bcdb-1ce2823ac122/com.twitter_jsr166e-1.1.0.jar to class loader
16/10/12 08:15:23 INFO Executor: Fetching spark://192.168.6.151:60027/jars/datastax_spark-cassandra-connector-2.0.0-M2-s_2.11.jar with timestamp 1476274503002
16/10/12 08:15:23 INFO Utils: Fetching spark://192.168.6.151:60027/jars/datastax_spark-cassandra-connector-2.0.0-M2-s_2.11.jar to /tmp/spark-39508304-8579-4e51-ae16-e1cf8a22c139/userFiles-b34a93b4-cffc-4ec3-bcdb-1ce2823ac122/fetchFileTemp1156901688704158624.tmp
16/10/12 08:15:24 INFO Executor: Adding file:/tmp/spark-39508304-8579-4e51-ae16-e1cf8a22c139/userFiles-b34a93b4-cffc-4ec3-bcdb-1ce2823ac122/datastax_spark-cassandra-connector-2.0.0-M2-s_2.11.jar to class loader
16/10/12 08:15:24 INFO Executor: Fetching spark://192.168.6.151:60027/jars/org.joda_joda-convert-1.2.jar with timestamp 1476274503003
16/10/12 08:15:24 INFO Utils: Fetching spark://192.168.6.151:60027/jars/org.joda_joda-convert-1.2.jar to /tmp/spark-39508304-8579-4e51-ae16-e1cf8a22c139/userFiles-b34a93b4-cffc-4ec3-bcdb-1ce2823ac122/fetchFileTemp3155581321752809377.tmp
16/10/12 08:15:24 INFO Executor: Adding file:/tmp/spark-39508304-8579-4e51-ae16-e1cf8a22c139/userFiles-b34a93b4-cffc-4ec3-bcdb-1ce2823ac122/org.joda_joda-convert-1.2.jar to class loader
16/10/12 08:15:24 INFO Executor: Fetching spark://192.168.6.151:60027/jars/commons-beanutils_commons-beanutils-1.8.0.jar with timestamp 1476274503003
16/10/12 08:15:24 INFO Utils: Fetching spark://192.168.6.151:60027/jars/commons-beanutils_commons-beanutils-1.8.0.jar to /tmp/spark-39508304-8579-4e51-ae16-e1cf8a22c139/userFiles-b34a93b4-cffc-4ec3-bcdb-1ce2823ac122/fetchFileTemp3073542072392987481.tmp
16/10/12 08:15:24 INFO Executor: Adding file:/tmp/spark-39508304-8579-4e51-ae16-e1cf8a22c139/userFiles-b34a93b4-cffc-4ec3-bcdb-1ce2823ac122/commons-beanutils_commons-beanutils-1.8.0.jar to class loader
16/10/12 08:15:24 INFO Executor: Fetching spark://192.168.6.151:60027/jars/org.scala-lang_scala-reflect-2.11.8.jar with timestamp 1476274503003
16/10/12 08:15:24 INFO Utils: Fetching spark://192.168.6.151:60027/jars/org.scala-lang_scala-reflect-2.11.8.jar to /tmp/spark-39508304-8579-4e51-ae16-e1cf8a22c139/userFiles-b34a93b4-cffc-4ec3-bcdb-1ce2823ac122/fetchFileTemp8208827726672004865.tmp
16/10/12 08:15:24 INFO Executor: Adding file:/tmp/spark-39508304-8579-4e51-ae16-e1cf8a22c139/userFiles-b34a93b4-cffc-4ec3-bcdb-1ce2823ac122/org.scala-lang_scala-reflect-2.11.8.jar to class loader
16/10/12 08:15:24 INFO Executor: Fetching spark://192.168.6.151:60027/jars/joda-time_joda-time-2.3.jar with timestamp 1476274503003
16/10/12 08:15:24 INFO Utils: Fetching spark://192.168.6.151:60027/jars/joda-time_joda-time-2.3.jar to /tmp/spark-39508304-8579-4e51-ae16-e1cf8a22c139/userFiles-b34a93b4-cffc-4ec3-bcdb-1ce2823ac122/fetchFileTemp3706194028917481017.tmp
16/10/12 08:15:24 INFO Executor: Adding file:/tmp/spark-39508304-8579-4e51-ae16-e1cf8a22c139/userFiles-b34a93b4-cffc-4ec3-bcdb-1ce2823ac122/joda-time_joda-time-2.3.jar to class loader
16/10/12 08:15:24 INFO Executor: Fetching spark://192.168.6.151:60027/jars/io.netty_netty-all-4.0.33.Final.jar with timestamp 1476274503003
16/10/12 08:15:24 INFO Utils: Fetching spark://192.168.6.151:60027/jars/io.netty_netty-all-4.0.33.Final.jar to /tmp/spark-39508304-8579-4e51-ae16-e1cf8a22c139/userFiles-b34a93b4-cffc-4ec3-bcdb-1ce2823ac122/fetchFileTemp5973767129005243936.tmp
16/10/12 08:15:24 INFO Executor: Adding file:/tmp/spark-39508304-8579-4e51-ae16-e1cf8a22c139/userFiles-b34a93b4-cffc-4ec3-bcdb-1ce2823ac122/io.netty_netty-all-4.0.33.Final.jar to class loader
16/10/12 08:15:24 INFO CodeGenerator: Code generated in 270.608582 ms
16/10/12 08:15:24 INFO Executor: Finished task 0.0 in stage 0.0 (TID 0). 1064 bytes result sent to driver
16/10/12 08:15:24 INFO DAGScheduler: ResultStage 0 (processCmd at CliDriver.java:376) finished in 1.701 s
16/10/12 08:15:24 INFO TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) in 1664 ms on localhost (1/1)
16/10/12 08:15:24 INFO DAGScheduler: Job 0 finished: processCmd at CliDriver.java:376, took 2.311670 s
16/10/12 08:15:24 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool
Time taken: 5.907 seconds
16/10/12 08:15:24 INFO CliDriver: Time taken: 5.907 seconds
16/10/12 08:15:29 INFO CassandraConnector: Disconnected from Cassandra cluster: Test Cluster
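Per the deprecation warning above, the non-deprecated form of the same registration would be the following (a sketch: identical options, only TABLE changes to VIEW):

create TEMPORARY VIEW kv USING org.apache.spark.sql.cassandra OPTIONS (table "kv",keyspace "mykeyspace", cluster "Test Cluster",pushdown "true");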
spark-sql> select * from kv;
16/10/12 08:15:43 INFO SparkSqlParser: Parsing command: select * from kv
16/10/12 08:15:43 INFO CassandraSourceRelation: Input Predicates: []
16/10/12 08:15:43 INFO CassandraSourceRelation: Input Predicates: []
16/10/12 08:15:43 INFO CodeGenerator: Code generated in 57.196131 ms
16/10/12 08:15:43 INFO Cluster: New Cassandra host /127.0.0.1:9042 added
16/10/12 08:15:43 INFO CassandraConnector: Connected to Cassandra cluster: Test Cluster
16/10/12 08:15:44 INFO SparkContext: Starting job: processCmd at CliDriver.java:376
16/10/12 08:15:44 INFO DAGScheduler: Got job 1 (processCmd at CliDriver.java:376) with 6 output partitions
16/10/12 08:15:44 INFO DAGScheduler: Final stage: ResultStage 1 (processCmd at CliDriver.java:376)
16/10/12 08:15:44 INFO DAGScheduler: Parents of final stage: List()
16/10/12 08:15:44 INFO DAGScheduler: Missing parents: List()
16/10/12 08:15:44 INFO DAGScheduler: Submitting ResultStage 1 (MapPartitionsRDD[7] at processCmd at CliDriver.java:376), which has no missing parents
16/10/12 08:15:44 INFO MemoryStore: Block broadcast_1 stored as values in memory (estimated size 11.6 KB, free 366.3 MB)
16/10/12 08:15:44 INFO MemoryStore: Block broadcast_1_piece0 stored as bytes in memory (estimated size 6.0 KB, free 366.3 MB)
16/10/12 08:15:44 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on 192.168.6.151:44882 (size: 6.0 KB, free: 366.3 MB)
16/10/12 08:15:44 INFO SparkContext: Created broadcast 1 from broadcast at DAGScheduler.scala:1012
16/10/12 08:15:44 INFO DAGScheduler: Submitting 6 missing tasks from ResultStage 1 (MapPartitionsRDD[7] at processCmd at CliDriver.java:376)
16/10/12 08:15:44 INFO TaskSchedulerImpl: Adding task set 1.0 with 6 tasks
16/10/12 08:15:44 INFO TaskSetManager: Starting task 0.0 in stage 1.0 (TID 1, localhost, partition 0, NODE_LOCAL, 13410 bytes)
16/10/12 08:15:44 INFO TaskSetManager: Starting task 1.0 in stage 1.0 (TID 2, localhost, partition 1, NODE_LOCAL, 14240 bytes)
16/10/12 08:15:44 INFO Executor: Running task 0.0 in stage 1.0 (TID 1)
16/10/12 08:15:44 INFO Executor: Running task 1.0 in stage 1.0 (TID 2)
16/10/12 08:15:45 INFO Executor: Finished task 1.0 in stage 1.0 (TID 2). 1071 bytes result sent to driver
16/10/12 08:15:45 INFO TaskSetManager: Starting task 2.0 in stage 1.0 (TID 3, localhost, partition 2, NODE_LOCAL, 13887 bytes)
16/10/12 08:15:45 INFO Executor: Running task 2.0 in stage 1.0 (TID 3)
16/10/12 08:15:45 INFO TaskSetManager: Finished task 1.0 in stage 1.0 (TID 2) in 610 ms on localhost (1/6)
16/10/12 08:15:45 INFO Executor: Finished task 0.0 in stage 1.0 (TID 1). 1071 bytes result sent to driver
16/10/12 08:15:45 INFO TaskSetManager: Starting task 3.0 in stage 1.0 (TID 4, localhost, partition 3, NODE_LOCAL, 13055 bytes)
16/10/12 08:15:45 INFO Executor: Running task 3.0 in stage 1.0 (TID 4)
16/10/12 08:15:45 INFO TaskSetManager: Finished task 0.0 in stage 1.0 (TID 1) in 1373 ms on localhost (2/6)
16/10/12 08:15:45 INFO Executor: Finished task 2.0 in stage 1.0 (TID 3). 1333 bytes result sent to driver
16/10/12 08:15:46 INFO BlockManagerInfo: Removed broadcast_0_piece0 on 192.168.6.151:44882 in memory (size: 1964.0 B, free: 366.3 MB)
16/10/12 08:15:46 INFO TaskSetManager: Starting task 4.0 in stage 1.0 (TID 5, localhost, partition 4, NODE_LOCAL, 13170 bytes)
16/10/12 08:15:46 INFO Executor: Running task 4.0 in stage 1.0 (TID 5)
16/10/12 08:15:46 INFO TaskSetManager: Finished task 2.0 in stage 1.0 (TID 3) in 846 ms on localhost (3/6)
16/10/12 08:15:46 INFO Executor: Finished task 3.0 in stage 1.0 (TID 4). 1362 bytes result sent to driver
16/10/12 08:15:46 INFO TaskSetManager: Starting task 5.0 in stage 1.0 (TID 6, localhost, partition 5, NODE_LOCAL, 8066 bytes)
16/10/12 08:15:46 INFO Executor: Running task 5.0 in stage 1.0 (TID 6)
16/10/12 08:15:46 INFO TaskSetManager: Finished task 3.0 in stage 1.0 (TID 4) in 658 ms on localhost (4/6)
16/10/12 08:15:46 INFO Executor: Finished task 4.0 in stage 1.0 (TID 5). 1071 bytes result sent to driver
16/10/12 08:15:46 INFO TaskSetManager: Finished task 4.0 in stage 1.0 (TID 5) in 476 ms on localhost (5/6)
16/10/12 08:15:46 INFO Executor: Finished task 5.0 in stage 1.0 (TID 6). 1071 bytes result sent to driver
16/10/12 08:15:46 INFO TaskSetManager: Finished task 5.0 in stage 1.0 (TID 6) in 174 ms on localhost (6/6)
16/10/12 08:15:46 INFO TaskSchedulerImpl: Removed TaskSet 1.0, whose tasks have all completed, from pool
16/10/12 08:15:46 INFO DAGScheduler: ResultStage 1 (processCmd at CliDriver.java:376) finished in 1.949 s
16/10/12 08:15:46 INFO DAGScheduler: Job 1 finished: processCmd at CliDriver.java:376, took 1.994766 s
key1 1
key4 4
key3 3
key2 2
Time taken: 3.044 seconds, Fetched 4 row(s)
16/10/12 08:15:46 INFO CliDriver: Time taken: 3.044 seconds, Fetched 4 row(s)
16/10/12 08:15:53 INFO CassandraConnector: Disconnected from Cassandra cluster: Test Cluster
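(Note the row order key1, key4, key3, key2: the connector scans token ranges in Cassandra's partitioner order, so results are not sorted by key; add an ORDER BY in Spark if ordering matters.)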
spark-sql> desc kv;
16/10/12 08:16:24 INFO SparkSqlParser: Parsing command: desc kv
16/10/12 08:16:24 INFO SparkContext: Starting job: processCmd at CliDriver.java:376
16/10/12 08:16:24 INFO DAGScheduler: Got job 2 (processCmd at CliDriver.java:376) with 1 output partitions
16/10/12 08:16:24 INFO DAGScheduler: Final stage: ResultStage 2 (processCmd at CliDriver.java:376)
16/10/12 08:16:24 INFO DAGScheduler: Parents of final stage: List()
16/10/12 08:16:24 INFO DAGScheduler: Missing parents: List()
16/10/12 08:16:24 INFO DAGScheduler: Submitting ResultStage 2 (MapPartitionsRDD[10] at processCmd at CliDriver.java:376), which has no missing parents
16/10/12 08:16:24 INFO MemoryStore: Block broadcast_2 stored as values in memory (estimated size 4.2 KB, free 366.3 MB)
16/10/12 08:16:24 INFO MemoryStore: Block broadcast_2_piece0 stored as bytes in memory (estimated size 2.5 KB, free 366.3 MB)
16/10/12 08:16:24 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory on 192.168.6.151:44882 (size: 2.5 KB, free: 366.3 MB)
16/10/12 08:16:24 INFO SparkContext: Created broadcast 2 from broadcast at DAGScheduler.scala:1012
16/10/12 08:16:24 INFO DAGScheduler: Submitting 1 missing tasks from ResultStage 2 (MapPartitionsRDD[10] at processCmd at CliDriver.java:376)
16/10/12 08:16:24 INFO TaskSchedulerImpl: Adding task set 2.0 with 1 tasks
16/10/12 08:16:25 INFO TaskSetManager: Starting task 0.0 in stage 2.0 (TID 7, localhost, partition 0, PROCESS_LOCAL, 6167 bytes)
16/10/12 08:16:25 INFO Executor: Running task 0.0 in stage 2.0 (TID 7)
16/10/12 08:16:25 INFO CodeGenerator: Code generated in 16.186786 ms
16/10/12 08:16:25 INFO Executor: Finished task 0.0 in stage 2.0 (TID 7). 1051 bytes result sent to driver
16/10/12 08:16:25 INFO TaskSetManager: Finished task 0.0 in stage 2.0 (TID 7) in 69 ms on localhost (1/1)
16/10/12 08:16:25 INFO TaskSchedulerImpl: Removed TaskSet 2.0, whose tasks have all completed, from pool
16/10/12 08:16:25 INFO DAGScheduler: ResultStage 2 (processCmd at CliDriver.java:376) finished in 0.070 s
16/10/12 08:16:25 INFO DAGScheduler: Job 2 finished: processCmd at CliDriver.java:376, took 0.083123 s
key string NULL
value int NULL
Time taken: 0.128 seconds, Fetched 2 row(s)
16/10/12 08:16:25 INFO CliDriver: Time taken: 0.128 seconds, Fetched 2 row(s)
spark-sql> select * from kv where value > 2;
16/10/12 08:16:34 INFO SparkSqlParser: Parsing command: select * from kv where value > 2
16/10/12 08:16:34 INFO CassandraSourceRelation: Input Predicates: [IsNotNull(value), GreaterThan(value,2)]
16/10/12 08:16:34 INFO CassandraSourceRelation: Input Predicates: [IsNotNull(value), GreaterThan(value,2)]
16/10/12 08:16:34 INFO CodeGenerator: Code generated in 14.815303 ms
16/10/12 08:16:34 INFO Cluster: New Cassandra host localhost/127.0.0.1:9042 added
16/10/12 08:16:34 INFO CassandraConnector: Connected to Cassandra cluster: Test Cluster
16/10/12 08:16:35 INFO SparkContext: Starting job: processCmd at CliDriver.java:376
16/10/12 08:16:35 INFO DAGScheduler: Got job 3 (processCmd at CliDriver.java:376) with 6 output partitions
16/10/12 08:16:35 INFO DAGScheduler: Final stage: ResultStage 3 (processCmd at CliDriver.java:376)
16/10/12 08:16:35 INFO DAGScheduler: Parents of final stage: List()
16/10/12 08:16:35 INFO DAGScheduler: Missing parents: List()
16/10/12 08:16:35 INFO DAGScheduler: Submitting ResultStage 3 (MapPartitionsRDD[14] at processCmd at CliDriver.java:376), which has no missing parents
16/10/12 08:16:35 INFO MemoryStore: Block broadcast_3 stored as values in memory (estimated size 12.4 KB, free 366.3 MB)
16/10/12 08:16:35 INFO MemoryStore: Block broadcast_3_piece0 stored as bytes in memory (estimated size 6.2 KB, free 366.3 MB)
16/10/12 08:16:35 INFO BlockManagerInfo: Added broadcast_3_piece0 in memory on 192.168.6.151:44882 (size: 6.2 KB, free: 366.3 MB)
16/10/12 08:16:35 INFO SparkContext: Created broadcast 3 from broadcast at DAGScheduler.scala:1012
16/10/12 08:16:35 INFO DAGScheduler: Submitting 6 missing tasks from ResultStage 3 (MapPartitionsRDD[14] at processCmd at CliDriver.java:376)
16/10/12 08:16:35 INFO TaskSchedulerImpl: Adding task set 3.0 with 6 tasks
16/10/12 08:16:35 INFO TaskSetManager: Starting task 0.0 in stage 3.0 (TID 8, localhost, partition 0, NODE_LOCAL, 13426 bytes)
16/10/12 08:16:35 INFO TaskSetManager: Starting task 1.0 in stage 3.0 (TID 9, localhost, partition 1, NODE_LOCAL, 14256 bytes)
16/10/12 08:16:35 INFO Executor: Running task 0.0 in stage 3.0 (TID 8)
16/10/12 08:16:35 INFO Executor: Running task 1.0 in stage 3.0 (TID 9)
16/10/12 08:16:35 INFO Executor: Finished task 0.0 in stage 3.0 (TID 8). 1133 bytes result sent to driver
16/10/12 08:16:35 INFO TaskSetManager: Starting task 2.0 in stage 3.0 (TID 10, localhost, partition 2, NODE_LOCAL, 13903 bytes)
16/10/12 08:16:35 INFO TaskSetManager: Finished task 0.0 in stage 3.0 (TID 8) in 481 ms on localhost (1/6)
16/10/12 08:16:35 INFO Executor: Running task 2.0 in stage 3.0 (TID 10)
16/10/12 08:16:35 INFO Executor: Finished task 1.0 in stage 3.0 (TID 9). 1133 bytes result sent to driver
16/10/12 08:16:35 INFO TaskSetManager: Starting task 3.0 in stage 3.0 (TID 11, localhost, partition 3, NODE_LOCAL, 13071 bytes)
16/10/12 08:16:35 INFO TaskSetManager: Finished task 1.0 in stage 3.0 (TID 9) in 545 ms on localhost (2/6)
16/10/12 08:16:35 INFO Executor: Running task 3.0 in stage 3.0 (TID 11)
16/10/12 08:16:36 INFO Executor: Finished task 3.0 in stage 3.0 (TID 11). 1336 bytes result sent to driver
16/10/12 08:16:36 INFO TaskSetManager: Starting task 4.0 in stage 3.0 (TID 12, localhost, partition 4, NODE_LOCAL, 13186 bytes)
16/10/12 08:16:36 INFO TaskSetManager: Finished task 3.0 in stage 3.0 (TID 11) in 535 ms on localhost (3/6)
16/10/12 08:16:36 INFO Executor: Running task 4.0 in stage 3.0 (TID 12)
16/10/12 08:16:36 INFO Executor: Finished task 2.0 in stage 3.0 (TID 10). 1293 bytes result sent to driver
16/10/12 08:16:36 INFO TaskSetManager: Starting task 5.0 in stage 3.0 (TID 13, localhost, partition 5, NODE_LOCAL, 8082 bytes)
16/10/12 08:16:36 INFO Executor: Running task 5.0 in stage 3.0 (TID 13)
16/10/12 08:16:36 INFO TaskSetManager: Finished task 2.0 in stage 3.0 (TID 10) in 626 ms on localhost (4/6)
16/10/12 08:16:36 INFO Executor: Finished task 5.0 in stage 3.0 (TID 13). 1133 bytes result sent to driver
16/10/12 08:16:36 INFO TaskSetManager: Finished task 5.0 in stage 3.0 (TID 13) in 103 ms on localhost (5/6)
16/10/12 08:16:36 INFO Executor: Finished task 4.0 in stage 3.0 (TID 12). 1133 bytes result sent to driver
16/10/12 08:16:36 INFO TaskSetManager: Finished task 4.0 in stage 3.0 (TID 12) in 327 ms on localhost (6/6)
16/10/12 08:16:36 INFO TaskSchedulerImpl: Removed TaskSet 3.0, whose tasks have all completed, from pool
16/10/12 08:16:36 INFO DAGScheduler: ResultStage 3 (processCmd at CliDriver.java:376) finished in 1.406 s
16/10/12 08:16:36 INFO DAGScheduler: Job 3 finished: processCmd at CliDriver.java:376, took 1.449255 s
key4 4
key3 3
Time taken: 1.958 seconds, Fetched 2 row(s)
16/10/12 08:16:36 INFO CliDriver: Time taken: 1.958 seconds, Fetched 2 row(s)
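The two "Input Predicates" lines above confirm the pushdown: IsNotNull(value) and GreaterThan(value,2) are handed to the Cassandra source instead of being applied after a full scan. A way to check this without INFO logging is EXPLAIN, whose scan node should list the pushed filters (a sketch; exact plan text varies by Spark version):

spark-sql> explain select * from kv where value > 2;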
spark-sql> exit;
16/10/12 08:16:41 INFO SparkUI: Stopped Spark web UI at http://192.168.6.151:4040
16/10/12 08:16:41 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
16/10/12 08:16:41 INFO MemoryStore: MemoryStore cleared
16/10/12 08:16:41 INFO BlockManager: BlockManager stopped
16/10/12 08:16:41 INFO BlockManagerMaster: BlockManagerMaster stopped
16/10/12 08:16:41 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
16/10/12 08:16:41 INFO SparkContext: Successfully stopped SparkContext
16/10/12 08:16:41 INFO ShutdownHookManager: Shutdown hook called
16/10/12 08:16:41 INFO ShutdownHookManager: Deleting directory /tmp/spark-39508304-8579-4e51-ae16-e1cf8a22c139
16/10/12 08:16:41 INFO ShutdownHookManager: Deleting directory /tmp/spark-be3c499a-b678-43d1-8a0d-cdb7236428cd
16/10/12 08:16:43 INFO CassandraConnector: Disconnected from Cassandra cluster: Test Cluster
16/10/12 08:16:43 INFO SerialShutdownHooks: Successfully executed shutdown hook: Clearing session cache for C* connector
================== Turn off verbose output to make the logs cleaner ==================
[donghua@vmxdb01 ~]$ diff spark-2.0.1-bin-hadoop2.7/conf/log4j.properties.template spark-2.0.1-bin-hadoop2.7/conf/log4j.properties
19c19
< log4j.rootCategory=INFO, console
---
> log4j.rootCategory=WARN, console
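One way to produce the log4j.properties shown in the diff (a sketch, assuming the stock template shipped with Spark 2.0.1):

[donghua@vmxdb01 ~]$ cd spark-2.0.1-bin-hadoop2.7/conf
[donghua@vmxdb01 conf]$ cp log4j.properties.template log4j.properties
[donghua@vmxdb01 conf]$ sed -i 's/^log4j.rootCategory=INFO, console/log4j.rootCategory=WARN, console/' log4j.properties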
spark-sql> select * from kv where value > 2;
Error in query: Table or view not found: kv; line 1 pos 14
spark-sql> create TEMPORARY TABLE kv USING org.apache.spark.sql.cassandra OPTIONS (table "kv",keyspace "mykeyspace", cluster "Test Cluster",pushdown "true");
16/10/12 08:28:09 WARN SparkStrategies$DDLStrategy: CREATE TEMPORARY TABLE kv USING... is deprecated, please use CREATE TEMPORARY VIEW viewName USING... instead
Time taken: 4.008 seconds
spark-sql> select * from kv;
key1 1
key4 4
key3 3
key2 2
Time taken: 2.253 seconds, Fetched 4 row(s)
spark-sql> select substring(key,1,3) from kv;
key
key
key
key
Time taken: 1.328 seconds, Fetched 4 row(s)
spark-sql> select substring(key,1,3),count(*) from kv group by substring(key,1,3);
key 4
Time taken: 3.518 seconds, Fetched 1 row(s)
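(Unlike the value > 2 filter, substring() and the aggregation have no CQL equivalent to push down, so the connector still fetches the rows and Spark evaluates the expressions itself; all four keys share the prefix "key", hence the single group with count 4.)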
spark-sql>