Sunday, January 28, 2018

Convert Excel into CSV using pandas

Donghuas-MacBook-Air:Downloads donghua$ python
Python 3.6.3 |Anaconda, Inc.| (default, Oct  6 2017, 12:04:38) 
[GCC 4.2.1 Compatible Clang 4.0.1 (tags/RELEASE_401/final)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import pandas as pd
>>> import numpy as np
>>> df = pd.read_excel("/Users/donghua/Downloads/LN University.xls",sheet_name="Sheet1",header=None, skiprows=3)
>>> df.head(1)
   0       1             2            3       4    5       6   7   \
0   1  鞍山师范学院  201310169001  花楸果实中花青素的提取  创新训练项目  侯文锋  110604   4   

                                          8   9   10     11    12     13   14  \
0  杨晓龙 110607 \n王博  110505 \n陈中意 110629       辛广  教授  15000  5000  10000  550   

                                                  15  
0  本项目以花楸为原材料,通过用表面活性剂结合酸化的常规提取剂辅助超声波法提取花楸果实中花青素,...  
>>> df.to_csv("/Users/donghua/Downloads/LN_University_20180125.csv",sep='\t',header=False, encoding='utf-8') 
>>> exit()

Donghuas-MacBook-Air:Downloads donghua$ 


The interactive session above shows that some cells (for example columns 8 and 15) contain embedded newlines, which would split a single record across several lines in the CSV. The standalone script below repeats the conversion but replaces those newlines with spaces before writing the file:

import pandas as pd

# Read the sheet, skipping the three title rows; columns stay unnamed (0, 1, 2, ...)
df = pd.read_excel("/Users/donghua/Downloads/LN University.xls",
                   sheet_name="Sheet1", header=None, skiprows=3)

# Option 1: replace embedded newlines at the column level
# df[8] = df[8].str.replace(r'\n', ' ')
# df[15] = df[15].str.replace(r'\n', ' ')

# Option 2: replace embedded newlines across the whole DataFrame
df = df.replace({r'\n': ' '}, regex=True)

df.to_csv("/Users/donghua/Downloads/LN_University_20180125.csv",
          sep=',', line_terminator='\n', escapechar='\\',
          header=False, encoding='utf-8')
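
A quick sanity check on the generated file (a shell sketch, assuming the paths used above): after the newline replacement, every record should occupy exactly one line.

wc -l /Users/donghua/Downloads/LN_University_20180125.csv    # line count should equal the number of data rows
head -1 /Users/donghua/Downloads/LN_University_20180125.csv  # preview the first record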

Saturday, January 27, 2018

Use sqoop export to move data from HDFS into MySQL


MariaDB [employees]> create table current_dept_emp2 as  select * from current_dept_emp where 1=2;
Query OK, 0 rows affected (0.05 sec)
Records: 0  Duplicates: 0  Warnings: 0

[donghua@cdh-vm ~]$ sqoop export --connect jdbc:mysql://cdh-vm.dbaglobe.com/employees --username employee_user --password password --table current_dept_emp2  --export-dir /user/donghua/current_dept_emp
Warning: /opt/cloudera/parcels/CDH-5.13.1-1.cdh5.13.1.p0.2/bin/../lib/sqoop/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
18/01/27 05:43:54 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6-cdh5.13.1
18/01/27 05:43:54 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
18/01/27 05:43:55 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
18/01/27 05:43:55 INFO tool.CodeGenTool: Beginning code generation
18/01/27 05:43:55 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `current_dept_emp2` AS t LIMIT 1
18/01/27 05:43:55 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `current_dept_emp2` AS t LIMIT 1
18/01/27 05:43:55 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /opt/cloudera/parcels/CDH/lib/hadoop-mapreduce
Note: /tmp/sqoop-donghua/compile/4eb832477301808137f8d255765ba2ca/current_dept_emp2.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
18/01/27 05:43:56 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-donghua/compile/4eb832477301808137f8d255765ba2ca/current_dept_emp2.jar
18/01/27 05:43:56 INFO mapreduce.ExportJobBase: Beginning export of current_dept_emp2
18/01/27 05:43:57 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
18/01/27 05:43:58 INFO Configuration.deprecation: mapred.reduce.tasks.speculative.execution is deprecated. Instead, use mapreduce.reduce.speculative
18/01/27 05:43:58 INFO Configuration.deprecation: mapred.map.tasks.speculative.execution is deprecated. Instead, use mapreduce.map.speculative
18/01/27 05:43:58 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
18/01/27 05:43:58 INFO client.RMProxy: Connecting to ResourceManager at cdh-vm.dbaglobe.com/192.168.56.10:8032
18/01/27 05:44:00 INFO input.FileInputFormat: Total input paths to process : 1
18/01/27 05:44:00 INFO input.FileInputFormat: Total input paths to process : 1
18/01/27 05:44:00 INFO mapreduce.JobSubmitter: number of splits:4
18/01/27 05:44:01 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1517023991003_0018
18/01/27 05:44:01 INFO impl.YarnClientImpl: Submitted application application_1517023991003_0018
18/01/27 05:44:01 INFO mapreduce.Job: The url to track the job: http://cdh-vm.dbaglobe.com:8088/proxy/application_1517023991003_0018/
18/01/27 05:44:01 INFO mapreduce.Job: Running job: job_1517023991003_0018
18/01/27 05:44:08 INFO mapreduce.Job: Job job_1517023991003_0018 running in uber mode : false
18/01/27 05:44:08 INFO mapreduce.Job:  map 0% reduce 0%
18/01/27 05:44:16 INFO mapreduce.Job:  map 25% reduce 0%
18/01/27 05:44:22 INFO mapreduce.Job:  map 50% reduce 0%
18/01/27 05:44:28 INFO mapreduce.Job:  map 75% reduce 0%
18/01/27 05:44:34 INFO mapreduce.Job:  map 100% reduce 0%
18/01/27 05:44:35 INFO mapreduce.Job: Job job_1517023991003_0018 completed successfully
18/01/27 05:44:35 INFO mapreduce.Job: Counters: 30
File System Counters
FILE: Number of bytes read=0
FILE: Number of bytes written=695328
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=10241715
HDFS: Number of bytes written=0
HDFS: Number of read operations=19
HDFS: Number of large read operations=0
HDFS: Number of write operations=0
Job Counters 
Launched map tasks=4
Data-local map tasks=4
Total time spent by all maps in occupied slots (ms)=20479
Total time spent by all reduces in occupied slots (ms)=0
Total time spent by all map tasks (ms)=20479
Total vcore-milliseconds taken by all map tasks=20479
Total megabyte-milliseconds taken by all map tasks=31455744
Map-Reduce Framework
Map input records=300024
Map output records=300024
Input split bytes=711
Spilled Records=0
Failed Shuffles=0
Merged Map outputs=0
GC time elapsed (ms)=332
CPU time spent (ms)=15020
Physical memory (bytes) snapshot=1057984512
Virtual memory (bytes) snapshot=11192446976
Total committed heap usage (bytes)=862453760
File Input Format Counters 
Bytes Read=0
File Output Format Counters 
Bytes Written=0
18/01/27 05:44:35 INFO mapreduce.ExportJobBase: Transferred 9.7673 MB in 37.4601 seconds (266.9952 KB/sec)
18/01/27 05:44:35 INFO mapreduce.ExportJobBase: Exported 300024 records.

MariaDB [employees]> select count(*) from current_dept_emp2;
+----------+
| count(*) |
+----------+
|   300024 |
+----------+
1 row in set (0.09 sec)
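
Since Sqoop warns that --password on the command line is insecure, the same export can read the password from a file instead. A minimal sketch, assuming the connection details above; the password-file location is hypothetical:

echo -n 'password' | hdfs dfs -put - /user/donghua/.mysql-pass.txt   # hypothetical path, no trailing newline
hdfs dfs -chmod 400 /user/donghua/.mysql-pass.txt
sqoop export --connect jdbc:mysql://cdh-vm.dbaglobe.com/employees \
  --username employee_user --password-file /user/donghua/.mysql-pass.txt \
  --table current_dept_emp2 --export-dir /user/donghua/current_dept_emp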

Use Sqoop to perform incremental data loading (--incremental=lastmodified for a timestamp column)

[donghua@cdh-vm ~]$ hdfs dfs -cat /user/donghua/t2/part-m-00000
1,2018-01-27 04:50:07.0
2,2018-01-27 04:50:18.0
[donghua@cdh-vm ~]$ 

MariaDB [employees]> insert into t2 values(3,current_timestamp());
Query OK, 1 row affected (0.01 sec)

MariaDB [employees]> update t2 set last_updated_at=current_timestamp() where id=2;
Query OK, 1 row affected (0.07 sec)
Rows matched: 1  Changed: 1  Warnings: 0

MariaDB [employees]> select * from t2;
+----+---------------------+
| id | last_updated_at     |
+----+---------------------+
|  1 | 2018-01-27 04:50:07 |
|  2 | 2018-01-27 05:10:14 |
|  3 | 2018-01-27 05:09:45 |
+----+---------------------+
3 rows in set (0.00 sec)

MariaDB [employees]> 


[donghua@cdh-vm ~]$ sqoop import --connect jdbc:mysql://cdh-vm.dbaglobe.com/employees  --username employee_user --password password --table t2  -m 1 --target-dir=/user/donghua/t2 --check-column=last_updated_at --incremental=lastmodified --last-value='2018-01-27 05:06:03.0'
Warning: /opt/cloudera/parcels/CDH-5.13.1-1.cdh5.13.1.p0.2/bin/../lib/sqoop/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
18/01/27 05:11:59 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6-cdh5.13.1
18/01/27 05:11:59 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
18/01/27 05:12:00 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
18/01/27 05:12:00 INFO tool.CodeGenTool: Beginning code generation
18/01/27 05:12:00 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `t2` AS t LIMIT 1
18/01/27 05:12:00 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `t2` AS t LIMIT 1
18/01/27 05:12:00 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /opt/cloudera/parcels/CDH/lib/hadoop-mapreduce
Note: /tmp/sqoop-donghua/compile/631be22fe0124698ede97beba0c8288e/t2.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
18/01/27 05:12:01 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-donghua/compile/631be22fe0124698ede97beba0c8288e/t2.jar
18/01/27 05:12:02 ERROR tool.ImportTool: Import failed: --merge-key or --append is required when using --incremental lastmodified and the output directory exists.

[donghua@cdh-vm ~]$ sqoop import --connect jdbc:mysql://cdh-vm.dbaglobe.com/employees  --username employee_user --password password --table t2  -m 1 --target-dir=/user/donghua/t2 --check-column=last_updated_at --incremental=lastmodified --last-value='2018-01-27 05:06:03.0' --merge-key=id
Warning: /opt/cloudera/parcels/CDH-5.13.1-1.cdh5.13.1.p0.2/bin/../lib/sqoop/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
18/01/27 05:12:40 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6-cdh5.13.1
18/01/27 05:12:40 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
18/01/27 05:12:40 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
18/01/27 05:12:40 INFO tool.CodeGenTool: Beginning code generation
18/01/27 05:12:41 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `t2` AS t LIMIT 1
18/01/27 05:12:41 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `t2` AS t LIMIT 1
18/01/27 05:12:41 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /opt/cloudera/parcels/CDH/lib/hadoop-mapreduce
Note: /tmp/sqoop-donghua/compile/2e4f90897221b505b822c323c3cb2b41/t2.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
18/01/27 05:12:42 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-donghua/compile/2e4f90897221b505b822c323c3cb2b41/t2.jar
18/01/27 05:12:43 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `t2` AS t LIMIT 1
18/01/27 05:12:43 INFO tool.ImportTool: Incremental import based on column `last_updated_at`
18/01/27 05:12:43 INFO tool.ImportTool: Lower bound value: '2018-01-27 05:06:03.0'
18/01/27 05:12:43 INFO tool.ImportTool: Upper bound value: '2018-01-27 05:12:43.0'
18/01/27 05:12:43 WARN manager.MySQLManager: It looks like you are importing from mysql.
18/01/27 05:12:43 WARN manager.MySQLManager: This transfer can be faster! Use the --direct
18/01/27 05:12:43 WARN manager.MySQLManager: option to exercise a MySQL-specific fast path.
18/01/27 05:12:43 INFO manager.MySQLManager: Setting zero DATETIME behavior to convertToNull (mysql)
18/01/27 05:12:43 INFO mapreduce.ImportJobBase: Beginning import of t2
18/01/27 05:12:43 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
18/01/27 05:12:43 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
18/01/27 05:12:43 INFO client.RMProxy: Connecting to ResourceManager at cdh-vm.dbaglobe.com/192.168.56.10:8032
18/01/27 05:12:47 INFO db.DBInputFormat: Using read commited transaction isolation
18/01/27 05:12:47 INFO mapreduce.JobSubmitter: number of splits:1
18/01/27 05:12:48 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1517023991003_0016
18/01/27 05:12:48 INFO impl.YarnClientImpl: Submitted application application_1517023991003_0016
18/01/27 05:12:48 INFO mapreduce.Job: The url to track the job: http://cdh-vm.dbaglobe.com:8088/proxy/application_1517023991003_0016/
18/01/27 05:12:48 INFO mapreduce.Job: Running job: job_1517023991003_0016
18/01/27 05:12:55 INFO mapreduce.Job: Job job_1517023991003_0016 running in uber mode : false
18/01/27 05:12:55 INFO mapreduce.Job:  map 0% reduce 0%
18/01/27 05:13:01 INFO mapreduce.Job:  map 100% reduce 0%
18/01/27 05:13:02 INFO mapreduce.Job: Job job_1517023991003_0016 completed successfully
18/01/27 05:13:02 INFO mapreduce.Job: Counters: 30
File System Counters
FILE: Number of bytes read=0
FILE: Number of bytes written=175177
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=87
HDFS: Number of bytes written=48
HDFS: Number of read operations=4
HDFS: Number of large read operations=0
HDFS: Number of write operations=2
Job Counters 
Launched map tasks=1
Other local map tasks=1
Total time spent by all maps in occupied slots (ms)=4073
Total time spent by all reduces in occupied slots (ms)=0
Total time spent by all map tasks (ms)=4073
Total vcore-milliseconds taken by all map tasks=4073
Total megabyte-milliseconds taken by all map tasks=6256128
Map-Reduce Framework
Map input records=2
Map output records=2
Input split bytes=87
Spilled Records=0
Failed Shuffles=0
Merged Map outputs=0
GC time elapsed (ms)=61
CPU time spent (ms)=920
Physical memory (bytes) snapshot=196935680
Virtual memory (bytes) snapshot=2785828864
Total committed heap usage (bytes)=155713536
File Input Format Counters 
Bytes Read=0
File Output Format Counters 
Bytes Written=48
18/01/27 05:13:02 INFO mapreduce.ImportJobBase: Transferred 48 bytes in 19.1241 seconds (2.5099 bytes/sec)
18/01/27 05:13:02 INFO mapreduce.ImportJobBase: Retrieved 2 records.
18/01/27 05:13:02 INFO tool.ImportTool: Final destination exists, will run merge job.
18/01/27 05:13:02 INFO Configuration.deprecation: mapred.output.key.class is deprecated. Instead, use mapreduce.job.output.key.class
18/01/27 05:13:02 INFO client.RMProxy: Connecting to ResourceManager at cdh-vm.dbaglobe.com/192.168.56.10:8032
18/01/27 05:13:07 INFO input.FileInputFormat: Total input paths to process : 2
18/01/27 05:13:07 INFO mapreduce.JobSubmitter: number of splits:2
18/01/27 05:13:07 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1517023991003_0017
18/01/27 05:13:08 INFO impl.YarnClientImpl: Submitted application application_1517023991003_0017
18/01/27 05:13:08 INFO mapreduce.Job: The url to track the job: http://cdh-vm.dbaglobe.com:8088/proxy/application_1517023991003_0017/
18/01/27 05:13:08 INFO mapreduce.Job: Running job: job_1517023991003_0017
18/01/27 05:13:15 INFO mapreduce.Job: Job job_1517023991003_0017 running in uber mode : false
18/01/27 05:13:15 INFO mapreduce.Job:  map 0% reduce 0%
18/01/27 05:13:20 INFO mapreduce.Job:  map 50% reduce 0%
18/01/27 05:13:24 INFO mapreduce.Job:  map 100% reduce 0%
18/01/27 05:13:31 INFO mapreduce.Job:  map 100% reduce 100%
18/01/27 05:13:31 INFO mapreduce.Job: Job job_1517023991003_0017 completed successfully
18/01/27 05:13:31 INFO mapreduce.Job: Counters: 49
File System Counters
FILE: Number of bytes read=90
FILE: Number of bytes written=526653
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=386
HDFS: Number of bytes written=72
HDFS: Number of read operations=9
HDFS: Number of large read operations=0
HDFS: Number of write operations=2
Job Counters 
Launched map tasks=2
Launched reduce tasks=1
Data-local map tasks=2
Total time spent by all maps in occupied slots (ms)=6496
Total time spent by all reduces in occupied slots (ms)=3317
Total time spent by all map tasks (ms)=6496
Total time spent by all reduce tasks (ms)=3317
Total vcore-milliseconds taken by all map tasks=6496
Total vcore-milliseconds taken by all reduce tasks=3317
Total megabyte-milliseconds taken by all map tasks=9977856
Total megabyte-milliseconds taken by all reduce tasks=5094912
Map-Reduce Framework
Map input records=4
Map output records=4
Map output bytes=96
Map output materialized bytes=122
Input split bytes=290
Combine input records=0
Combine output records=0
Reduce input groups=3
Reduce shuffle bytes=122
Reduce input records=4
Reduce output records=3
Spilled Records=8
Shuffled Maps =2
Failed Shuffles=0
Merged Map outputs=2
GC time elapsed (ms)=211
CPU time spent (ms)=1900
Physical memory (bytes) snapshot=1147371520
Virtual memory (bytes) snapshot=8375828480
Total committed heap usage (bytes)=1154482176
Shuffle Errors
BAD_ID=0
CONNECTION=0
IO_ERROR=0
WRONG_LENGTH=0
WRONG_MAP=0
WRONG_REDUCE=0
File Input Format Counters 
Bytes Read=96
File Output Format Counters 
Bytes Written=72
18/01/27 05:13:31 INFO tool.ImportTool: Incremental import complete! To run another incremental import of all data following this import, supply the following arguments:
18/01/27 05:13:31 INFO tool.ImportTool:  --incremental lastmodified
18/01/27 05:13:31 INFO tool.ImportTool:   --check-column last_updated_at
18/01/27 05:13:31 INFO tool.ImportTool:   --last-value 2018-01-27 05:12:43.0
18/01/27 05:13:31 INFO tool.ImportTool: (Consider saving this with 'sqoop job --create')
[donghua@cdh-vm ~]$ hdfs dfs -ls /user/donghua/t2/
Found 2 items
-rw-r--r--   1 donghua supergroup          0 2018-01-27 05:13 /user/donghua/t2/_SUCCESS
-rw-r--r--   1 donghua supergroup         72 2018-01-27 05:13 /user/donghua/t2/part-r-00000
[donghua@cdh-vm ~]$ hdfs dfs -cat /user/donghua/t2/part-r-00000
1,2018-01-27 04:50:07.0
2,2018-01-27 05:10:14.0
3,2018-01-27 05:09:45.0
[donghua@cdh-vm ~]$ 
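
The log above already prints the arguments for the next delta (--last-value 2018-01-27 05:12:43.0). A follow-up run would look like this sketch, with -P prompting for the password instead of passing it on the command line:

sqoop import --connect jdbc:mysql://cdh-vm.dbaglobe.com/employees \
  --username employee_user -P --table t2 -m 1 \
  --target-dir=/user/donghua/t2 \
  --check-column=last_updated_at --incremental=lastmodified \
  --last-value='2018-01-27 05:12:43.0' --merge-key=id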

Use Sqoop to perform an incremental import (--check-column=id --incremental=append --last-value=)


[donghua@cdh-vm ~]$ mysql -u employee_user -ppassword -D employees

MariaDB [employees]> create table t1 (id int primary key, c1 varchar(10));

MariaDB [employees]> insert into t1 values(1,'a'),(2,'b');

MariaDB [employees]> select * from t1;
+----+------+
| id | c1   |
+----+------+
|  1 | a    |
|  2 | b    |
+----+------+
2 rows in set (0.00 sec)

[donghua@cdh-vm ~]$ beeline -u jdbc:hive2://localhost:10000/default -n donghua --silent=true
0: jdbc:hive2://localhost:10000/default> create table employees.t1(id int, c1 varchar(10));


[donghua@cdh-vm ~]$ sqoop import --connect jdbc:mysql://cdh-vm.dbaglobe.com/employees \
>  --username employee_user --password password --table t1 \
>  --split-by=id --hive-import --hive-table=employees.t1 \
>  --warehouse-dir=/user/hive/warehouse \
>  --check-column=id --incremental=append --last-value=0
Warning: /opt/cloudera/parcels/CDH-5.13.1-1.cdh5.13.1.p0.2/bin/../lib/sqoop/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
18/01/27 03:54:38 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6-cdh5.13.1
18/01/27 03:54:38 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
18/01/27 03:54:38 INFO tool.BaseSqoopTool: Using Hive-specific delimiters for output. You can override
18/01/27 03:54:38 INFO tool.BaseSqoopTool: delimiters with --fields-terminated-by, etc.
18/01/27 03:54:38 WARN tool.BaseSqoopTool: It seems that you're doing hive import directly into default
18/01/27 03:54:38 WARN tool.BaseSqoopTool: hive warehouse directory which is not supported. Sqoop is
18/01/27 03:54:38 WARN tool.BaseSqoopTool: firstly importing data into separate directory and then
18/01/27 03:54:38 WARN tool.BaseSqoopTool: inserting data into hive. Please consider removing
18/01/27 03:54:38 WARN tool.BaseSqoopTool: --target-dir or --warehouse-dir into /user/hive/warehouse in
18/01/27 03:54:38 WARN tool.BaseSqoopTool: case that you will detect any issues.
18/01/27 03:54:38 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
18/01/27 03:54:38 INFO tool.CodeGenTool: Beginning code generation
18/01/27 03:54:38 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `t1` AS t LIMIT 1
18/01/27 03:54:38 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `t1` AS t LIMIT 1
18/01/27 03:54:38 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /opt/cloudera/parcels/CDH/lib/hadoop-mapreduce
Note: /tmp/sqoop-donghua/compile/1941b9efeafd888916e872561fa71b1d/t1.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
18/01/27 03:54:40 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-donghua/compile/1941b9efeafd888916e872561fa71b1d/t1.jar
18/01/27 03:54:41 INFO tool.ImportTool: Maximal id query for free form incremental import: SELECT MAX(`id`) FROM `t1`
18/01/27 03:54:41 INFO tool.ImportTool: Incremental import based on column `id`
18/01/27 03:54:41 INFO tool.ImportTool: Lower bound value: 0
18/01/27 03:54:41 INFO tool.ImportTool: Upper bound value: 2
18/01/27 03:54:41 WARN manager.MySQLManager: It looks like you are importing from mysql.
18/01/27 03:54:41 WARN manager.MySQLManager: This transfer can be faster! Use the --direct
18/01/27 03:54:41 WARN manager.MySQLManager: option to exercise a MySQL-specific fast path.
18/01/27 03:54:41 INFO manager.MySQLManager: Setting zero DATETIME behavior to convertToNull (mysql)
18/01/27 03:54:41 INFO mapreduce.ImportJobBase: Beginning import of t1
18/01/27 03:54:41 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
18/01/27 03:54:41 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
18/01/27 03:54:41 INFO client.RMProxy: Connecting to ResourceManager at cdh-vm.dbaglobe.com/192.168.56.10:8032
18/01/27 03:54:46 INFO db.DBInputFormat: Using read commited transaction isolation
18/01/27 03:54:46 INFO db.DataDrivenDBInputFormat: BoundingValsQuery: SELECT MIN(`id`), MAX(`id`) FROM `t1` WHERE ( `id` > 0 AND `id` <= 2 )
18/01/27 03:54:46 INFO db.IntegerSplitter: Split size: 0; Num splits: 4 from: 1 to: 2
18/01/27 03:54:46 INFO mapreduce.JobSubmitter: number of splits:2
18/01/27 03:54:47 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1517023991003_0009
18/01/27 03:54:47 INFO impl.YarnClientImpl: Submitted application application_1517023991003_0009
18/01/27 03:54:47 INFO mapreduce.Job: The url to track the job: http://cdh-vm.dbaglobe.com:8088/proxy/application_1517023991003_0009/
18/01/27 03:54:47 INFO mapreduce.Job: Running job: job_1517023991003_0009
18/01/27 03:54:54 INFO mapreduce.Job: Job job_1517023991003_0009 running in uber mode : false
18/01/27 03:54:54 INFO mapreduce.Job:  map 0% reduce 0%
18/01/27 03:55:02 INFO mapreduce.Job:  map 50% reduce 0%
18/01/27 03:55:06 INFO mapreduce.Job:  map 100% reduce 0%
18/01/27 03:55:07 INFO mapreduce.Job: Job job_1517023991003_0009 completed successfully
18/01/27 03:55:07 INFO mapreduce.Job: Counters: 30
File System Counters
FILE: Number of bytes read=0
FILE: Number of bytes written=350308
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=197
HDFS: Number of bytes written=8
HDFS: Number of read operations=8
HDFS: Number of large read operations=0
HDFS: Number of write operations=4
Job Counters 
Launched map tasks=2
Other local map tasks=2
Total time spent by all maps in occupied slots (ms)=7843
Total time spent by all reduces in occupied slots (ms)=0
Total time spent by all map tasks (ms)=7843
Total vcore-milliseconds taken by all map tasks=7843
Total megabyte-milliseconds taken by all map tasks=12046848
Map-Reduce Framework
Map input records=2
Map output records=2
Input split bytes=197
Spilled Records=0
Failed Shuffles=0
Merged Map outputs=0
GC time elapsed (ms)=110
CPU time spent (ms)=1970
Physical memory (bytes) snapshot=413765632
Virtual memory (bytes) snapshot=5572857856
Total committed heap usage (bytes)=402653184
File Input Format Counters 
Bytes Read=0
File Output Format Counters 
Bytes Written=8
18/01/27 03:55:07 INFO mapreduce.ImportJobBase: Transferred 8 bytes in 26.2002 seconds (0.3053 bytes/sec)
18/01/27 03:55:07 INFO mapreduce.ImportJobBase: Retrieved 2 records.
18/01/27 03:55:07 INFO util.AppendUtils: Creating missing output directory - t1
18/01/27 03:55:07 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `t1` AS t LIMIT 1
18/01/27 03:55:07 INFO hive.HiveImport: Loading uploaded data into Hive

Logging initialized using configuration in jar:file:/opt/cloudera/parcels/CDH-5.13.1-1.cdh5.13.1.p0.2/jars/hive-common-1.1.0-cdh5.13.1.jar!/hive-log4j.properties
OK
Time taken: 2.037 seconds
Loading data to table employees.t1
Table employees.t1 stats: [numFiles=2, totalSize=8]
OK
Time taken: 0.646 seconds

0: jdbc:hive2://localhost:10000/default> select * from employees.t1;
+--------+--------+--+
| t1.id  | t1.c1  |
+--------+--------+--+
| 1      | a      |
| 2      | b      |
+--------+--------+--+


MariaDB [employees]> insert into t1 values(3,'a'),(4,'b');
Query OK, 2 rows affected (0.00 sec)
Records: 2  Duplicates: 0  Warnings: 0


[donghua@cdh-vm ~]$ sqoop import --connect jdbc:mysql://cdh-vm.dbaglobe.com/employees  --username employee_user --password password --table t1  --split-by=id --hive-import --hive-table=employees.t1  --warehouse-dir=/user/hive/warehouse  --check-column=id --incremental=append --last-value=2
Warning: /opt/cloudera/parcels/CDH-5.13.1-1.cdh5.13.1.p0.2/bin/../lib/sqoop/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
18/01/27 04:11:31 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6-cdh5.13.1
18/01/27 04:11:31 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
18/01/27 04:11:31 INFO tool.BaseSqoopTool: Using Hive-specific delimiters for output. You can override
18/01/27 04:11:31 INFO tool.BaseSqoopTool: delimiters with --fields-terminated-by, etc.
18/01/27 04:11:31 WARN tool.BaseSqoopTool: It seems that you're doing hive import directly into default
18/01/27 04:11:31 WARN tool.BaseSqoopTool: hive warehouse directory which is not supported. Sqoop is
18/01/27 04:11:31 WARN tool.BaseSqoopTool: firstly importing data into separate directory and then
18/01/27 04:11:31 WARN tool.BaseSqoopTool: inserting data into hive. Please consider removing
18/01/27 04:11:31 WARN tool.BaseSqoopTool: --target-dir or --warehouse-dir into /user/hive/warehouse in
18/01/27 04:11:31 WARN tool.BaseSqoopTool: case that you will detect any issues.
18/01/27 04:11:31 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
18/01/27 04:11:31 INFO tool.CodeGenTool: Beginning code generation
18/01/27 04:11:31 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `t1` AS t LIMIT 1
18/01/27 04:11:31 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `t1` AS t LIMIT 1
18/01/27 04:11:31 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /opt/cloudera/parcels/CDH/lib/hadoop-mapreduce
Note: /tmp/sqoop-donghua/compile/80c2f1f6c1f1b6c4b9fca928aa6353a8/t1.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
18/01/27 04:11:33 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-donghua/compile/80c2f1f6c1f1b6c4b9fca928aa6353a8/t1.jar
18/01/27 04:11:34 INFO tool.ImportTool: Maximal id query for free form incremental import: SELECT MAX(`id`) FROM `t1`
18/01/27 04:11:34 INFO tool.ImportTool: Incremental import based on column `id`
18/01/27 04:11:34 INFO tool.ImportTool: Lower bound value: 2
18/01/27 04:11:34 INFO tool.ImportTool: Upper bound value: 4
18/01/27 04:11:34 WARN manager.MySQLManager: It looks like you are importing from mysql.
18/01/27 04:11:34 WARN manager.MySQLManager: This transfer can be faster! Use the --direct
18/01/27 04:11:34 WARN manager.MySQLManager: option to exercise a MySQL-specific fast path.
18/01/27 04:11:34 INFO manager.MySQLManager: Setting zero DATETIME behavior to convertToNull (mysql)
18/01/27 04:11:34 INFO mapreduce.ImportJobBase: Beginning import of t1
18/01/27 04:11:34 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
18/01/27 04:11:34 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
18/01/27 04:11:34 INFO client.RMProxy: Connecting to ResourceManager at cdh-vm.dbaglobe.com/192.168.56.10:8032
18/01/27 04:11:38 INFO db.DBInputFormat: Using read commited transaction isolation
18/01/27 04:11:38 INFO db.DataDrivenDBInputFormat: BoundingValsQuery: SELECT MIN(`id`), MAX(`id`) FROM `t1` WHERE ( `id` > 2 AND `id` <= 4 )
18/01/27 04:11:38 INFO db.IntegerSplitter: Split size: 0; Num splits: 4 from: 3 to: 4
18/01/27 04:11:38 INFO mapreduce.JobSubmitter: number of splits:2
18/01/27 04:11:38 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1517023991003_0012
18/01/27 04:11:38 INFO impl.YarnClientImpl: Submitted application application_1517023991003_0012
18/01/27 04:11:38 INFO mapreduce.Job: The url to track the job: http://cdh-vm.dbaglobe.com:8088/proxy/application_1517023991003_0012/
18/01/27 04:11:38 INFO mapreduce.Job: Running job: job_1517023991003_0012
18/01/27 04:11:45 INFO mapreduce.Job: Job job_1517023991003_0012 running in uber mode : false
18/01/27 04:11:45 INFO mapreduce.Job:  map 0% reduce 0%
18/01/27 04:11:51 INFO mapreduce.Job:  map 50% reduce 0%
18/01/27 04:11:57 INFO mapreduce.Job:  map 100% reduce 0%
18/01/27 04:11:57 INFO mapreduce.Job: Job job_1517023991003_0012 completed successfully
18/01/27 04:11:57 INFO mapreduce.Job: Counters: 30
File System Counters
FILE: Number of bytes read=0
FILE: Number of bytes written=350308
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=197
HDFS: Number of bytes written=8
HDFS: Number of read operations=8
HDFS: Number of large read operations=0
HDFS: Number of write operations=4
Job Counters 
Launched map tasks=2
Other local map tasks=2
Total time spent by all maps in occupied slots (ms)=7531
Total time spent by all reduces in occupied slots (ms)=0
Total time spent by all map tasks (ms)=7531
Total vcore-milliseconds taken by all map tasks=7531
Total megabyte-milliseconds taken by all map tasks=11567616
Map-Reduce Framework
Map input records=2
Map output records=2
Input split bytes=197
Spilled Records=0
Failed Shuffles=0
Merged Map outputs=0
GC time elapsed (ms)=114
CPU time spent (ms)=1800
Physical memory (bytes) snapshot=403120128
Virtual memory (bytes) snapshot=5573816320
Total committed heap usage (bytes)=359137280
File Input Format Counters 
Bytes Read=0
File Output Format Counters 
Bytes Written=8
18/01/27 04:11:57 INFO mapreduce.ImportJobBase: Transferred 8 bytes in 23.359 seconds (0.3425 bytes/sec)
18/01/27 04:11:57 INFO mapreduce.ImportJobBase: Retrieved 2 records.
18/01/27 04:11:57 INFO util.AppendUtils: Creating missing output directory - t1
18/01/27 04:11:57 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `t1` AS t LIMIT 1
18/01/27 04:11:57 INFO hive.HiveImport: Loading uploaded data into Hive

Logging initialized using configuration in jar:file:/opt/cloudera/parcels/CDH-5.13.1-1.cdh5.13.1.p0.2/jars/hive-common-1.1.0-cdh5.13.1.jar!/hive-log4j.properties
OK
Time taken: 1.853 seconds
Loading data to table employees.t1
Table employees.t1 stats: [numFiles=4, numRows=0, totalSize=16, rawDataSize=0]
OK
Time taken: 0.603 seconds

0: jdbc:hive2://localhost:10000/default> select * from employees.t1;
+--------+--------+--+
| t1.id  | t1.c1  |
+--------+--------+--+
| 1      | a      |
| 3      | a      |
| 2      | b      |
| 4      | b      |
+--------+--------+--+
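
As an optional cross-check after each incremental load (a sketch reusing the clients shown above), the Hive and MySQL row counts can be compared:

beeline -u jdbc:hive2://localhost:10000/default -n donghua --silent=true \
  -e 'select count(*) from employees.t1;'
mysql -u employee_user -p -D employees -e 'select count(*) from t1;'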

[donghua@cdh-vm ~]$ sqoop job --create emp_t1_incr  -- import --connect jdbc:mysql://cdh-vm.dbaglobe.com/employees  --username employee_user --password password --table t1  --split-by=id --hive-import --hive-table=employees.t1  --warehouse-dir=/user/hive/warehouse  --check-column=id --incremental=append --last-value=4
Warning: /opt/cloudera/parcels/CDH-5.13.1-1.cdh5.13.1.p0.2/bin/../lib/sqoop/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
18/01/27 04:21:32 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6-cdh5.13.1
18/01/27 04:21:32 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
18/01/27 04:21:32 INFO tool.BaseSqoopTool: Using Hive-specific delimiters for output. You can override
18/01/27 04:21:32 INFO tool.BaseSqoopTool: delimiters with --fields-terminated-by, etc.
18/01/27 04:21:32 WARN tool.BaseSqoopTool: It seems that you're doing hive import directly into default
18/01/27 04:21:32 WARN tool.BaseSqoopTool: hive warehouse directory which is not supported. Sqoop is
18/01/27 04:21:32 WARN tool.BaseSqoopTool: firstly importing data into separate directory and then
18/01/27 04:21:32 WARN tool.BaseSqoopTool: inserting data into hive. Please consider removing
18/01/27 04:21:32 WARN tool.BaseSqoopTool: --target-dir or --warehouse-dir into /user/hive/warehouse in
18/01/27 04:21:32 WARN tool.BaseSqoopTool: case that you will detect any issues.

[donghua@cdh-vm ~]$ sqoop job --list
Warning: /opt/cloudera/parcels/CDH-5.13.1-1.cdh5.13.1.p0.2/bin/../lib/sqoop/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
18/01/27 04:30:13 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6-cdh5.13.1
Available jobs:
  emp_t1_incr

[donghua@cdh-vm ~]$ sqoop job --show emp_t1_incr
Warning: /opt/cloudera/parcels/CDH-5.13.1-1.cdh5.13.1.p0.2/bin/../lib/sqoop/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
18/01/27 04:35:40 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6-cdh5.13.1
Enter password: 
Job: emp_t1_incr
Tool: import
Options:
----------------------------
verbose = false
hcatalog.drop.and.create.table = false
incremental.last.value = 4
db.connect.string = jdbc:mysql://cdh-vm.dbaglobe.com/employees
codegen.output.delimiters.escape = 0
codegen.output.delimiters.enclose.required = false
codegen.input.delimiters.field = 0
mainframe.input.dataset.type = p
split.limit = null
hbase.create.table = false
db.require.password = true
hdfs.append.dir = true
db.table = t1
codegen.input.delimiters.escape = 0
accumulo.create.table = false
import.fetch.size = null
codegen.input.delimiters.enclose.required = false
db.username = employee_user
reset.onemapper = false
codegen.output.delimiters.record = 10
import.max.inline.lob.size = 16777216
sqoop.throwOnError = false
hbase.bulk.load.enabled = false
hcatalog.create.table = false
db.clear.staging.table = false
incremental.col = id
codegen.input.delimiters.record = 0
hdfs.warehouse.dir = /user/hive/warehouse
enable.compression = false
hive.overwrite.table = false
hive.import = true
codegen.input.delimiters.enclose = 0
hive.table.name = employees.t1
accumulo.batch.size = 10240000
hive.drop.delims = false
customtool.options.jsonmap = {}
codegen.output.delimiters.enclose = 0
hdfs.delete-target.dir = false
codegen.output.dir = .
codegen.auto.compile.dir = true
relaxed.isolation = false
mapreduce.num.mappers = 4
accumulo.max.latency = 5000
import.direct.split.size = 0
sqlconnection.metadata.transaction.isolation.level = 2
codegen.output.delimiters.field = 1
export.new.update = UpdateOnly
incremental.mode = AppendRows
hdfs.file.format = TextFile
sqoop.oracle.escaping.disabled = true
codegen.compile.dir = /tmp/sqoop-donghua/compile/e7212eb92686a1486fa1cd44a6c9afc7
direct.import = false
temporary.dirRoot = _sqoop
db.split.column = id
hive.fail.table.exists = false
db.batch = false


[donghua@cdh-vm ~]$ sqoop job --exec emp_t1_incr
Warning: /opt/cloudera/parcels/CDH-5.13.1-1.cdh5.13.1.p0.2/bin/../lib/sqoop/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
18/01/27 04:38:01 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6-cdh5.13.1
Enter password: 
18/01/27 04:38:06 WARN tool.BaseSqoopTool: It seems that you're doing hive import directly into default
18/01/27 04:38:06 WARN tool.BaseSqoopTool: hive warehouse directory which is not supported. Sqoop is
18/01/27 04:38:06 WARN tool.BaseSqoopTool: firstly importing data into separate directory and then
18/01/27 04:38:06 WARN tool.BaseSqoopTool: inserting data into hive. Please consider removing
18/01/27 04:38:06 WARN tool.BaseSqoopTool: --target-dir or --warehouse-dir into /user/hive/warehouse in
18/01/27 04:38:06 WARN tool.BaseSqoopTool: case that you will detect any issues.
18/01/27 04:38:06 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
18/01/27 04:38:06 INFO tool.CodeGenTool: Beginning code generation
18/01/27 04:38:07 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `t1` AS t LIMIT 1
18/01/27 04:38:07 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `t1` AS t LIMIT 1
18/01/27 04:38:07 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /opt/cloudera/parcels/CDH/lib/hadoop-mapreduce
Note: /tmp/sqoop-donghua/compile/e3c397f1a5469f870ba19e95b80a66a2/t1.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
18/01/27 04:38:08 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-donghua/compile/e3c397f1a5469f870ba19e95b80a66a2/t1.jar
18/01/27 04:38:09 INFO tool.ImportTool: Maximal id query for free form incremental import: SELECT MAX(`id`) FROM `t1`
18/01/27 04:38:09 INFO tool.ImportTool: Incremental import based on column `id`
18/01/27 04:38:09 INFO tool.ImportTool: Lower bound value: 4
18/01/27 04:38:09 INFO tool.ImportTool: Upper bound value: 6
18/01/27 04:38:09 WARN manager.MySQLManager: It looks like you are importing from mysql.
18/01/27 04:38:09 WARN manager.MySQLManager: This transfer can be faster! Use the --direct
18/01/27 04:38:09 WARN manager.MySQLManager: option to exercise a MySQL-specific fast path.
18/01/27 04:38:09 INFO manager.MySQLManager: Setting zero DATETIME behavior to convertToNull (mysql)
18/01/27 04:38:09 INFO mapreduce.ImportJobBase: Beginning import of t1
18/01/27 04:38:09 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
18/01/27 04:38:09 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
18/01/27 04:38:09 INFO client.RMProxy: Connecting to ResourceManager at cdh-vm.dbaglobe.com/192.168.56.10:8032
18/01/27 04:38:12 INFO db.DBInputFormat: Using read commited transaction isolation
18/01/27 04:38:12 INFO db.DataDrivenDBInputFormat: BoundingValsQuery: SELECT MIN(`id`), MAX(`id`) FROM `t1` WHERE ( `id` > 4 AND `id` <= 6 )
18/01/27 04:38:12 INFO db.IntegerSplitter: Split size: 0; Num splits: 4 from: 5 to: 6
18/01/27 04:38:12 INFO mapreduce.JobSubmitter: number of splits:2
18/01/27 04:38:12 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1517023991003_0013
18/01/27 04:38:12 INFO impl.YarnClientImpl: Submitted application application_1517023991003_0013
18/01/27 04:38:12 INFO mapreduce.Job: The url to track the job: http://cdh-vm.dbaglobe.com:8088/proxy/application_1517023991003_0013/
18/01/27 04:38:12 INFO mapreduce.Job: Running job: job_1517023991003_0013
18/01/27 04:38:20 INFO mapreduce.Job: Job job_1517023991003_0013 running in uber mode : false
18/01/27 04:38:20 INFO mapreduce.Job:  map 0% reduce 0%
18/01/27 04:38:26 INFO mapreduce.Job:  map 50% reduce 0%
18/01/27 04:38:31 INFO mapreduce.Job:  map 100% reduce 0%
18/01/27 04:38:32 INFO mapreduce.Job: Job job_1517023991003_0013 completed successfully
18/01/27 04:38:33 INFO mapreduce.Job: Counters: 30
File System Counters
FILE: Number of bytes read=0
FILE: Number of bytes written=351166
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=197
HDFS: Number of bytes written=8
HDFS: Number of read operations=8
HDFS: Number of large read operations=0
HDFS: Number of write operations=4
Job Counters 
Launched map tasks=2
Other local map tasks=2
Total time spent by all maps in occupied slots (ms)=8071
Total time spent by all reduces in occupied slots (ms)=0
Total time spent by all map tasks (ms)=8071
Total vcore-milliseconds taken by all map tasks=8071
Total megabyte-milliseconds taken by all map tasks=12397056
Map-Reduce Framework
Map input records=2
Map output records=2
Input split bytes=197
Spilled Records=0
Failed Shuffles=0
Merged Map outputs=0
GC time elapsed (ms)=136
CPU time spent (ms)=1820
Physical memory (bytes) snapshot=404738048
Virtual memory (bytes) snapshot=5573140480
Total committed heap usage (bytes)=354942976
File Input Format Counters 
Bytes Read=0
File Output Format Counters 
Bytes Written=8
18/01/27 04:38:33 INFO mapreduce.ImportJobBase: Transferred 8 bytes in 23.4364 seconds (0.3413 bytes/sec)
18/01/27 04:38:33 INFO mapreduce.ImportJobBase: Retrieved 2 records.
18/01/27 04:38:33 INFO util.AppendUtils: Creating missing output directory - t1
18/01/27 04:38:33 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `t1` AS t LIMIT 1
18/01/27 04:38:33 INFO hive.HiveImport: Loading uploaded data into Hive

Logging initialized using configuration in jar:file:/opt/cloudera/parcels/CDH-5.13.1-1.cdh5.13.1.p0.2/jars/hive-common-1.1.0-cdh5.13.1.jar!/hive-log4j.properties
OK
Time taken: 1.95 seconds
Loading data to table employees.t1
Table employees.t1 stats: [numFiles=6, numRows=0, totalSize=24, rawDataSize=0]
OK
Time taken: 0.664 seconds

[donghua@cdh-vm ~]$ sqoop job --show emp_t1_incr
Warning: /opt/cloudera/parcels/CDH-5.13.1-1.cdh5.13.1.p0.2/bin/../lib/sqoop/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
18/01/27 04:38:50 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6-cdh5.13.1
Enter password: 
Job: emp_t1_incr
Tool: import
Options:
----------------------------
verbose = false
hcatalog.drop.and.create.table = false
incremental.last.value = 6
db.connect.string = jdbc:mysql://cdh-vm.dbaglobe.com/employees
codegen.output.delimiters.escape = 0
codegen.output.delimiters.enclose.required = false
codegen.input.delimiters.field = 0
mainframe.input.dataset.type = p
split.limit = null
hbase.create.table = false
db.require.password = true
hdfs.append.dir = true
db.table = t1
codegen.input.delimiters.escape = 0
accumulo.create.table = false
import.fetch.size = null
codegen.input.delimiters.enclose.required = false
db.username = employee_user
reset.onemapper = false
codegen.output.delimiters.record = 10
import.max.inline.lob.size = 16777216
sqoop.throwOnError = false
hbase.bulk.load.enabled = false
hcatalog.create.table = false
db.clear.staging.table = false
incremental.col = id
codegen.input.delimiters.record = 0
hdfs.warehouse.dir = /user/hive/warehouse
enable.compression = false
hive.overwrite.table = false
hive.import = true
codegen.input.delimiters.enclose = 0
hive.table.name = employees.t1
accumulo.batch.size = 10240000
hive.drop.delims = false
customtool.options.jsonmap = {}
codegen.output.delimiters.enclose = 0
hdfs.delete-target.dir = false
codegen.output.dir = .
codegen.auto.compile.dir = true
relaxed.isolation = false
mapreduce.num.mappers = 4
accumulo.max.latency = 5000
import.direct.split.size = 0
sqlconnection.metadata.transaction.isolation.level = 2
codegen.output.delimiters.field = 1
export.new.update = UpdateOnly
incremental.mode = AppendRows
hdfs.file.format = TextFile
sqoop.oracle.escaping.disabled = true
codegen.compile.dir = /tmp/sqoop-donghua/compile/39496d079794ae53a008a2da9cd2ac4a
direct.import = false
temporary.dirRoot = _sqoop
db.split.column = id
hive.fail.table.exists = false
db.batch = false
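
Each --exec advances the stored incremental.last.value automatically (here from 4 to 6), so the saved job can simply be re-run for every new batch of rows. A minimal sketch of the recurring cycle; sqoop job --delete removes the saved definition when it is no longer needed:

sqoop job --exec emp_t1_incr    # imports rows with id greater than the stored last-value
sqoop job --show emp_t1_incr    # incremental.last.value should now reflect the new maximum id
# sqoop job --delete emp_t1_incr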