
DB design tools, and restoring a single InnoDB table from Xtrabackup

2019-08-24 | Author: 数据库网络

 

 

DBA Morning Check List

1. List all of the databases in MySQL:

sqoop list-databases --connect jdbc:mysql://localhost:3306/ --username root --password 123456

2. Connect to MySQL and list the tables in the test database:

sqoop list-tables --connect jdbc:mysql://localhost:3306/test --username root --password 123456

3. Copy a relational table's structure into Hive (structure only, without the data):

sqoop create-hive-table --connect jdbc:mysql://localhost:3306/test --table sqoop_testTabinMySql --username root --password 123456 --hive-table testNewTabInHive

4. Import data from a relational database into Hive:

sqoop import --connect jdbc:mysql://localhost:3306/zxtest --username root --password 123456 --table sqoop_test --hive-import --hive-table s_test -m 1

5. Export Hive table data into MySQL (the target table hive_test must be created in MySQL before the export):

sqoop export --connect jdbc:mysql://localhost:3306/zxtest --username root --password root --table hive_test --export-dir /user/hive/warehouse/new_test_partition/dt=2013-03-05

6. Export a table's data from the database to files on HDFS:

sqoop import --connect jdbc:mysql://localhost:3306/compression --username=hadoop --password=123456 --table HADOOP_USER_INFO -m 1 --target-dir /user/test

7. Incrementally import a table's data from the database into HDFS:

sqoop import --connect jdbc:mysql://localhost:3306/compression --username=hadoop --password=123456 --table HADOOP_USER_INFO -m 1 --target-dir /user/test --check-column id --incremental append --last-value 3

Import

Sqoop data import has the following characteristics:

1. Supports text files (--as-textfile), Avro (--as-avrodatafile), and SequenceFiles (--as-sequencefile); RCFile is not supported yet, and text is the default.
2. Supports appending data, specified with --append.
3. Supports column selection (--columns) and row selection (--where), used together with --table.
4. Supports free-form queries, e.g. reading the result of a multi-table join such as 'SELECT a.*, b.* FROM a JOIN b on (a.id == b.id)'; this cannot be used together with --table.
5. Supports customizing the number of map tasks (-m).
6. Supports compression (--compress).
7. Supports importing relational data into Hive (--hive-import) and HBase (--hbase-table).

Importing data into Hive takes three steps: 1) import the data into HDFS; 2) create the Hive table; 3) use "LOAD DATA INPATH" to load the data into the table. Importing data into HBase takes two steps: 1) import the data into HDFS; 2) call the HBase put operation to write the data into the table row by row.

import migrates data from a relational database onto HDFS; the default directory is /user/${user.name}/${tablename}, and the target directory on HDFS can be set with --target-dir. export is the reverse of import, loading data from HDFS into a relational database. Because Sqoop performs the import with map tasks, and each map process is independent with no notion of a transaction, some map tasks may fail while others succeed. As a compromise, Sqoop lets you specify an intermediate staging table; only after the data lands there successfully is it moved into the result table. This is enabled with --staging-table, and the staging table's structure must also be created in advance:

sqoop export --connect jdbc:mysql://192.168.81.176/sqoop --username root -password passwd --table sds --export-dir /user/guojian/sds --staging-table sds_tmp

Note that the staging table cannot be used together with the --direct, --update-key, or --call (stored procedure) options.

create-hive-table imports a relational database table definition into a Hive table. Parameters:

--hive-home: the Hive installation directory; this parameter overrides the default Hive directory.
--hive-overwrite: overwrite data that already exists in the Hive table.
--create-hive-table: defaults to false; if the target table already exists, the creation job fails.
--hive-table: followed by the name of the Hive table to create.
--table: specifies the relational database table name.

sqoop create-hive-table --connect jdbc:mysql://192.168.81.176/sqoop --username root -password passwd --table sds --hive-table sds_bak

By default sds_bak is created in the default database. This step depends on HCatalog, which must be installed first; otherwise it fails with an error like:

Hive history file=/tmp/guojian/hive_job_log_cfbe2de9-a358-4130-945c-b97c0add649d_1628102887.txt
FAILED: ParseException line 1:44 mismatched input ')' expecting Identifier near '(' in column specification

metastore stores shared metadata for Sqoop jobs, so that multiple users can define and execute Sqoop jobs against the same metastore. By default it is stored in ~/.sqoop. Start it with "sqoop metastore" and shut it down with "sqoop metastore --shutdown". The metastore file's location is configured with sqoop.metastore.server.location in conf/sqoop-site.xml and points to a local file. The metastore can be accessed over TCP/IP; the port is configured with sqoop.metastore.server.port and defaults to 16000. Clients connect by setting sqoop.metastore.client.autoconnect.url or by using --meta-connect, configured as jdbc:hsqldb:hsql://<host>:<port>/sqoop, for example jdbc:hsqldb:hsql://metaserver.example.com:16000/sqoop.

Sqoop will read the entire content of the password file and use it as a password. This will include any trailing white space characters, such as the new line characters that are added by default by most text editors. You need to make sure that your password file contains only the characters that belong to your password.
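As a sketch of how the shared metastore is used in practice (assuming a metastore is already running at metaserver.example.com:16000, the hypothetical host from the example above), a saved job can be defined against it and then executed from any client:

sqoop job --meta-connect jdbc:hsqldb:hsql://metaserver.example.com:16000/sqoop --create myjob -- import --connect jdbc:mysql://localhost:3306/test --table sqoop_test -m 1

sqoop job --meta-connect jdbc:hsqldb:hsql://metaserver.example.com:16000/sqoop --exec myjob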
On the command line you can use the echo command with the -n switch to store a password without any trailing white space characters. For example, to store the password secret you would call echo -n "secret" > password.file.

Sqoop automatically supports several databases, including MySQL. Connect strings beginning with jdbc:mysql:// are handled automatically in Sqoop. (A full list of databases with built-in support is provided in the "Supported Databases" section. For some, you may need to install the JDBC driver yourself.)

You can use Sqoop with any other JDBC-compliant database. First, download the appropriate JDBC driver for the type of database you want to import, and install the .jar file in the $SQOOP_HOME/lib directory on your client machine. (This will be /usr/lib/sqoop/lib if you installed from an RPM or Debian package.) Each driver .jar file also has a specific driver class which defines the entry-point to the driver. For example, MySQL's Connector/J library has a driver class of com.mysql.jdbc.Driver. Refer to your database vendor-specific documentation to determine the main driver class. This class must be provided as an argument to Sqoop with --driver. For example, to connect to a SQLServer database, first download the driver from microsoft.com and install it in your Sqoop lib path.

Sqoop can also import the result set of an arbitrary SQL query. Instead of using the --table, --columns and --where arguments, you can specify a SQL statement with the --query argument. When importing a free-form query, you must specify a destination directory with --target-dir.

Note: if you are issuing the query wrapped with double quotes ("), you will have to use \$CONDITIONS instead of just $CONDITIONS to disallow your shell from treating it as a shell variable. For example, a double quoted query may look like: "SELECT * FROM x WHERE a='foo' AND \$CONDITIONS"

The facility of using a free-form query in the current version of Sqoop is limited to simple queries where there are no ambiguous projections and no OR conditions in the WHERE clause. Use of complex queries, such as queries that have sub-queries or joins leading to ambiguous projections, can lead to unexpected results. (That is, the WHERE clause cannot contain OR.)

Sqoop imports data in parallel from most database sources. You can specify the number of map tasks (parallel processes) to use to perform the import by using the -m or --num-mappers argument. Each of these arguments takes an integer value which corresponds to the degree of parallelism to employ. By default, four tasks are used. Some databases may see improved performance by increasing this value to 8 or 16.

When performing parallel imports, Sqoop needs a criterion by which it can split the workload. Sqoop uses a splitting column to split the workload. By default, Sqoop will identify the primary key column (if present) in a table and use it as the splitting column. The low and high values for the splitting column are retrieved from the database, and the map tasks operate on evenly-sized components of the total range. For example, if you had a table with a primary key column of id whose minimum value was 0 and maximum value was 1000, and Sqoop was directed to use 4 tasks, Sqoop would run four processes which each execute SQL statements of the form SELECT * FROM sometable WHERE id >= lo AND id < hi, with (lo, hi) set to (0, 250), (250, 500), (500, 750), and (750, 1001) in the different tasks. Sqoop cannot currently split on multi-column indices.
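Putting the free-form query rules together, a minimal invocation might look like the following sketch (db.foo.com and the join are illustrative; because there is no table whose primary key Sqoop could infer, --split-by names the splitting column explicitly):

$ sqoop import --connect jdbc:mysql://db.foo.com/corp \
    --query 'SELECT a.*, b.* FROM a JOIN b ON (a.id == b.id) WHERE $CONDITIONS' \
    --split-by a.id --target-dir /user/foo/joinresults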
If your table has no index column, or has a multi-column key, then you must also manually choose a splitting column. If a table does not have a primary key defined and the --split-by column is not provided, then the import will fail unless the number of mappers is explicitly set to one with the --num-mappers 1 option or the --autoreset-to-one-mapper option is used. The option --autoreset-to-one-mapper is typically used with the import-all-tables tool to automatically handle tables without a primary key in a schema.

Sqoop will copy the jars in the $SQOOP_HOME/lib folder to the job cache every time it starts a Sqoop job. When launched by Oozie this is unnecessary, since Oozie uses its own Sqoop share lib which keeps Sqoop's dependencies in the distributed cache. Oozie will do the localization of the Sqoop dependencies on each worker node only once, during the first Sqoop job, and reuse the jars on the worker nodes for subsequent jobs. Using the option --skip-dist-cache in a Sqoop command launched by Oozie will skip the step in which Sqoop copies its dependencies to the job cache and save massive I/O.

MySQL provides the mysqldump tool which can export data from MySQL to other systems very quickly. By supplying the --direct argument, you are specifying that Sqoop should attempt the direct import channel. This channel may be higher performance than using JDBC.

By default, Sqoop will import a table named foo to a directory named foo inside your home directory in HDFS. For example, if your username is someuser, then the import tool will write to /user/someuser/foo/(files). You can adjust the parent directory of the import with the --warehouse-dir argument. For example:

$ sqoop import --connect <connect-str> --table foo --warehouse-dir /shared

When using direct mode, you can specify additional arguments which should be passed to the underlying tool. If the argument -- is given on the command line, then subsequent arguments are sent directly to the underlying tool. For example, the following adjusts the character set used by mysqldump:

$ sqoop import --connect jdbc:mysql://server.foo.com/db --table bar --direct -- --default-character-set=latin1

By default, imports go to a new target location. If the destination directory already exists in HDFS, Sqoop will refuse to import and overwrite that directory's contents. If you use the --append argument, Sqoop will import data to a temporary directory and then rename the files into the normal target directory in a manner that does not conflict with existing filenames in that directory.

Sqoop is preconfigured to map most SQL types to appropriate Java or Hive representatives. However, the default mapping might not be suitable for everyone and can be overridden with --map-column-java (for changing the mapping to Java) or --map-column-hive (for changing the Hive mapping). Sqoop expects a comma separated list of mappings in the form <name of column>=<new type>. For example:

 

Preface

By Bill Richards, 2010/08/27 (first published: 2008/04/14)

Database Administrators can sometimes have one of the most stressful jobs in the company. If you have been a DBA for long, you know the scenario. You have just sat in your chair with your cup of coffee, and your phone starts ringing off the hook. The voice on the other end states that they can't pull up their data or they are getting timeouts, or the system is running slow. Okay, time to dig in; it's going to be one of those days! Is it Friday yet?

In this article, I will present ways to minimize those stressful days by having a pre-defined DBA morning checklist. A morning DBA checklist is a document of pre-defined administrative checks that are performed every morning to ensure that your server is at optimal performance. By having a standard list of items to check, you are more likely to catch and fix issues before there is a real problem.

The end result of the morning DBA checklist should have three sections. Section one contains the list of items that need to be checked. Section one should include checks from the following categories: performance, job failures, disk space, backups, connectivity, and anything specific to your environment, such as replication, mirroring, clustering, etc. Section two contains a place to write down issues and how they were resolved. The third section is a confirmation section where it is signed and dated. The third section is very important. Without this section, it is difficult to enforce and guarantee that these checks were performed.

The first step to create an effective morning checklist is to meet with all the DBAs and ask them these questions: 

  1. What do you check in the morning?
  2. How do you check it?
  3. What do you do when there is a problem?
  4. Is there anyone you notify in the event of a failure?

In my experience, every DBA has his or her own mental checklist and different ways of fixing issues. It is important to get a list of the items written down in a document. By combining the ideas of every DBA, you will come up with a more thorough checklist and a standardized way to fix issues, and problems are less likely to fall through the cracks.

After the DBA morning checklist is created, completed checklists should be archived in a notebook to ensure that each check was performed every day. This also serves as a history of fixes for past issues, and an audit trail for the DBA.

Since every database environment is different, and every IS shop has its own tools, every DBA's checklist will be different. The end goal is to create a checklist that is customized to your environment, in which issues can be found and fixed quickly, so that you can avoid having one of those difficult days.

With this in mind, listed below is a sample checklist. Your checklist should be unique to your environment and should help find and fix issues as quickly as possible.

$ sqoop import ... --map-column-java id=String,value=Integer


 

Section 1: DBA Morning Checklist

Backups

- Verify that the Network Backups are good by checking the backup emails. If a backup did not complete, contact _____ in the networking group, and send an email to the DBA group.

- Check the SQL Server backups. If a backup failed, research the cause of the failure and ensure that it is scheduled to run tonight.

- Check the database backup run duration of all production servers. Verify that the average time is within the normal range. Any significant increases in backup duration times need to be emailed to the networking group, requesting an explanation. The reason for this is that networking starts placing database backups to tape at certain times, and if they put it to tape before the DBAs are done backing up, the tape copy will be bad.

- Verify that all databases were backed up. If any new databases were not backed up, create a backup maintenance plan for them and check the current schedule to determine a backup time.

Disk Space

  • Verify the free space on each drive of the servers. If there is significant variance in free space from the day before, research the cause of the free space fluctuation and resolve if necessary. Often times, log files will grow because of monthly jobs.

Job Failures

  • Check for failed jobs by connecting to each SQL Server, selecting "job activity" and filtering on failed jobs. If a job failed, resolve the issue, contacting the owner of the job if necessary.

System Checks

- Check SQL logs on each server. In the event of a critical error, notify the DBA group and come to an agreement on how to resolve the problem.

- Check Application log on each server. In the event of a critical or unusual error, notify the DBA group and the networking group to determine what needs to be done to fix the error.

Performance

- Check Performance statistics for All Servers using the monitoring tool and research and resolve any issues.

- Check Performance Monitor on ALL production servers and verify that all counters are within the normal range.

Connectivity

  • Log into the Customer application and verify that it can connect to the database and pull up data. Verify that it is performing at an acceptable speed. In the event of a failure, email the Customer Support Group, DBA group, and the DBA manager, before proceeding to resolve the issue.

- Log into the Billing application and verify that it can connect to the database and pull up data. Verify that it is performing at an acceptable speed. In the event of a failure, email the Billing Support Group, DBA group, and the DBA manager, before proceeding to resolve the issue.

Replication

- Check replication on each server by checking each publication to make sure the distributor is running for each subscription.

- When replication is stopped, or changes to replication are made, send an email to the DBA group. For example, if the DBA stops the distributor, let the other DBAs know when it is stopped and then when it is restarted again.

- Check for any emails for the SQL Jobs that monitor row counts on major tables on the publisher and subscriber. If a wide variance occurs, send an email message to the DBAs and any appropriate IS personnel.

Sqoop will raise an exception if a configured mapping is not used.

 

    I got a DB design job for a meeting room booking system last week. There are many suitable tools for this kind of job, such as PowerDesigner, ERwin, HeidiSQL, dbschema, etc. Today, I'm going to demonstrate the last one: dbschema. This is the official website: https://www.dbschema.com

    We all know that Xtrabackup is a backup tool from Percona for InnoDB and XtraDB. It's usually used to back up whole databases physically. But how do we restore a single InnoDB table from a full Xtrabackup after dropping the table by accident? We can make use of the transportable tablespace feature to accomplish it.
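In outline, the restore that the rest of this post walks through looks like the following condensed sketch (table and path names are the ones used in the demonstration below):

mysql> CREATE TABLE zlm.test_flashbk (...);              -- recreate the table with the identical structure
mysql> ALTER TABLE zlm.test_flashbk DISCARD TABLESPACE;  -- throw away the new, empty .ibd file
# cp /data/backup/2018-07-06_10-09-22/zlm/test_flashbk.ibd /data/mysql/mysql3306/data/zlm/
# chown mysql.mysql /data/mysql/mysql3306/data/zlm/test_flashbk.ibd
mysql> ALTER TABLE zlm.test_flashbk IMPORT TABLESPACE;   -- adopt the .ibd file from the backup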

Section 2: Write down any issues and how they were resolved

This space is reserved for writing down issues and how they were fixed.

You should specify append mode when importing a table where new rows are continually being added with increasing row id values. You specify the column containing the row’s id with --check-column. Sqoop imports rows where the check column has a value greater than the one specified with --last-value.
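For instance, a first incremental pull on a table with an increasing id column might be invoked like this (a sketch; the connection details are illustrative):

$ sqoop import --connect jdbc:mysql://db.foo.com/corp --table EMPLOYEES \
    --incremental append --check-column id --last-value 0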

    As we all know, it is common sense that separating read and write operations can immensely increase the performance of a MySQL database, especially for queries executing SELECT statements against large tables. Therefore, we usually choose a proxy tool to deal with it. There are a lot of tools that can be used nowadays, such as mycat (by Apache), dble (based on mycat, by ActionTech), atlas, dbproxy (based on Qihoo360's atlas, by MeituanDianping), cetus (by NetEase) and so forth. I'm not going to compare which is the better tool to use. I just prefer to have a test of another popular tool, which is called "ProxySQL".

 

    I demonstrated how to restore a table from one server to another using transportable tablespace in yesterday's blog. Today, we'll use another tool called "mysqlfrm" to fetch the structure of tables from .frm files.

Section 3 - Confirmation

Completed By __________________________ Date: ___________________

At the end of an incremental import, the value which should be specified as --last-value for a subsequent import is printed to the screen. When running a subsequent import, you should specify --last-value in this way to ensure you import only the new or updated data.
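One way to avoid copying the printed value by hand is a saved job, which records the new --last-value after each run (a sketch; the job name and connection string are illustrative):

$ sqoop job --create employees_incr -- import --connect jdbc:mysql://db.foo.com/corp \
    --table EMPLOYEES --incremental append --check-column id --last-value 0
$ sqoop job --exec employees_incr    # later runs resume from the stored last-value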

 

Introduction

 

Conclusion

Creating a morning DBA checklist has helped me many times in the past. Oftentimes, I found CPU usage up near 100%, broken replication, connectivity problems, and space issues that I was able to resolve before the majority of the work force was present and the issue could escalate. By having a standard DBA checklist document, it ensures that nothing is forgotten which could result in a problem. It also minimizes downtime of a company or department, provides an archive of past issues and how they were fixed, and helps ensure that the DBA will have a less stressful day!

You can import data in one of two file formats: delimited text or SequenceFiles.

Introduction

 

Introduction

Delimited text is appropriate for most non-binary data types. It also readily supports further manipulation by other tools, such as Hive.

 

    dbschema is a rather simple tool to use even if you're a novice at designing a DB system. The dbschema downloaded from the official website is only free for 15 days, after which you have to pay for a license, but there is no limit on functionality at all. It provides two modes for designing the layout of your system: one is offline mode, and the other is connecting to DB servers. You can easily synchronize the tables of a database with your designed tables by refreshing them from time to time. It also supports almost all popular RDBMSs, such as Oracle, DB2, MySQL, and PostgreSQL. There are many key features which you can find on the homepage of the official website; I'm not going to describe each one of them.

 

Reading from SequenceFiles is higher-performance than reading from text files, as records do not need to be parsed.

    ProxySQL is a lightweight proxy tool based on a SQLite database. It provides high performance, especially in highly concurrent environments, as we can see below (compared with MaxScale).

 

    mysqlfrm is a tool designed to diagnose the information in .frm files in recovery scenarios. mysqlfrm provides two modes of operation. By default, it creates a new instance referencing the base directory given with "--basedir"; it also needs a port specified with "--port", which ought to be different from the one used by the original instance. The other mode is connecting to an already running instance using "--server". The new instance is shut down and all temporary files are deleted after it reads the data in the .frm files. Furthermore, there are two exclusions when using mysqlfrm: one is foreign key constraints, and the other is auto-increment number sequences.
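For the "--server" mode mentioned above, an invocation might look like the following sketch (the credentials and the spare port 3310 are illustrative; mysqlfrm still spawns a temporary instance, borrowing the binaries of the running server):

mysqlfrm --server=root:Passw0rd@localhost:3306 --port=3310 /data/backup/2018-07-06_10-09-22/zlm/test_flashbk.frm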

By default, data is not compressed. You can compress your data by using the deflate (gzip) algorithm with the -z or --compress argument, or specify any Hadoop compression codec using the --compression-codec argument. This applies to SequenceFile, text, and Avro files.
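For example, a compressed import might be requested like this (a sketch; SnappyCodec is one of the stock Hadoop codecs):

$ sqoop import --connect jdbc:mysql://db.foo.com/corp --table EMPLOYEES \
    --compress --compression-codec org.apache.hadoop.io.compress.SnappyCodec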

 

Procedure

 

While the choice of delimiters is most important for a text-mode import, it is still relevant if you import to SequenceFiles with --as-sequencefile. The generated class' toString() method will use the delimiters you specify, so subsequent formatting of the output data will rely on the delimiters you choose.

(Figure 1)

 

Example


 

    The meeting room booking system (I'll call it "mrbs") contains four tables: employee, department, conference_room, room_reservation. The details of the tables are shown below.

 

When Sqoop imports data to HDFS, it generates a Java class which can reinterpret the text files that it creates when doing a delimited-format import. The delimiters are chosen with arguments such as --fields-terminated-by; this controls both how the data is written to disk, and how the generated parse() method reinterprets this data. The delimiters used by the parse() method can be chosen independently of the output arguments, by using --input-fields-terminated-by, and so on. This is useful, for example, to generate classes which can parse records created with one set of delimiters, and emit the records to a different set of files using a separate set of delimiters.
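As a sketch of that last point, sqoop codegen can generate a class that parses comma-delimited records while emitting tab-delimited output (the table and connection string are illustrative):

$ sqoop codegen --connect jdbc:mysql://db.foo.com/corp --table EMPLOYEES \
    --input-fields-terminated-by ',' --fields-terminated-by '\t'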

    

 

Install the mysqlfrm tool.

Reading notes, July 2, 2014

    The configuration of ProxySQL is a three-layer structure: RUNTIME, MEMORY, and DISK.

**employee table**

 1 [root@zlm1 10:03:25 ~]
 2 #yum install mysql-utilities
 3 
 4 Installed:
 5   mysql-utilities.noarch 0:1.3.6-1.el7                                                                                                               
 6 
 7 Dependency Installed:
 8   mysql-connector-python.noarch 0:1.1.6-1.el7                                                                                                        
 9 
10 Complete!

Hive can put data into partitions for more efficient query performance. You can tell a Sqoop job to import data for Hive into a particular partition by specifying the --hive-partition-key and --hive-partition-value arguments. The partition value must be a string. Please see the Hive documentation for more details on partitioning.
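For example (a sketch; the partition key ds and its value are illustrative):

$ sqoop import --connect jdbc:mysql://db.foo.com/corp --table EMPLOYEES \
    --hive-import --hive-partition-key ds --hive-partition-value '2010-01-01'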

 

1 id             auto-increment id                  int(11)
2 user_id        employee number                    int(11)
3 user_name      user name                          varchar(20)
4 user_phone     user mobile number                 bigint
5 user_email     user email                         varchar(50)
6 user_dept_id   id of the user's department        int(11)
7 user_status    employment status (active, etc.)   tinyint(4)
8 create_time    user creation time                 datetime
9 update_time    user info modification time        datetime

 

7.3. Example Invocations

(Figure 2)

 

**Generate an Xtrabackup backup.**

The following examples illustrate how to use the import tool in a variety of situations.

 

**department table**

 1 [root@zlm1 10:07:36 ~]
 2 #innobackupex --defaults-file=/data/mysql/mysql3306/my3306.cnf --host=localhost --user=root --password=Passw0rd --sock=/tmp/mysql3306.sock /data/backup
 3 180706 10:09:22 innobackupex: Starting the backup operation
 4 
 5 IMPORTANT: Please check that the backup run completes successfully.
 6            At the end of a successful backup run innobackupex
 7            prints "completed OK!".
 8 
 9 //Omits the intermediate output.
10 
11 180706 10:10:27 Finished backing up non-InnoDB tables and files
12 180706 10:10:27 [00] Writing xtrabackup_binlog_info
13 180706 10:10:27 [00]        ...done
14 180706 10:10:27 Executing FLUSH NO_WRITE_TO_BINLOG ENGINE LOGS...
15 xtrabackup: The latest check point (for incremental): '1703733455'
16 xtrabackup: Stopping log copying thread.
17 .180706 10:10:27 >> log scanned up to (1703733464)
18 
19 180706 10:10:27 Executing UNLOCK TABLES
20 180706 10:10:27 All tables unlocked
21 180706 10:10:27 [00] Copying ib_buffer_pool to /data/backup/2018-07-06_10-09-22/ib_buffer_pool
22 180706 10:10:27 [00]        ...done
23 180706 10:10:27 Backup created in directory '/data/backup/2018-07-06_10-09-22'
24 MySQL binlog position: filename 'mysql-bin.000071', position '194', GTID of the last change '2a4b3562-2ab6-11e8-be7a-080027de0e0e:1-12715494'
25 180706 10:10:27 [00] Writing backup-my.cnf
26 180706 10:10:27 [00]        ...done
27 180706 10:10:27 [00] Writing xtrabackup_info
28 180706 10:10:27 [00]        ...done
29 xtrabackup: Transaction log of lsn (1703733455) to (1703733464) was copied.
30 180706 10:10:27 completed OK!
31 
32 [root@zlm1 10:10:27 ~]
33 #cd /data/backup
34 
35 [root@zlm1 10:13:14 /data/backup]
36 #ls -l
37 total 4
38 drwxr-x--- 9 root root 4096 Jul  6 10:10 2018-07-06_10-09-22
39 
40 [root@zlm1 10:13:15 /data/backup]
41 #cd 2018-07-06_10-09-22/
42 
43 [root@zlm1 10:13:19 /data/backup/2018-07-06_10-09-22]
44 #ls -l
45 total 102468
46 drwxr-x--- 2 root root        51 Jul  6 10:10 aaron8219
47 -rw-r----- 1 root root       433 Jul  6 10:10 backup-my.cnf
48 drwxr-x--- 2 root root        19 Jul  6 10:10 -help
49 -rw-r----- 1 root root      9492 Jul  6 10:10 ib_buffer_pool
50 -rw-r----- 1 root root 104857600 Jul  6 10:09 ibdata1
51 drwxr-x--- 2 root root      4096 Jul  6 10:10 mysql
52 drwxr-x--- 2 root root      8192 Jul  6 10:10 performance_schema
53 drwxr-x--- 2 root root      8192 Jul  6 10:10 sys
54 drwxr-x--- 2 root root      4096 Jul  6 10:10 sysbench
55 -rw-r----- 1 root root        69 Jul  6 10:10 xtrabackup_binlog_info
56 -rw-r----- 1 root root       119 Jul  6 10:10 xtrabackup_checkpoints
57 -rw-r----- 1 root root       639 Jul  6 10:10 xtrabackup_info
58 -rw-r----- 1 root root      2560 Jul  6 10:10 xtrabackup_logfile
59 drwxr-x--- 2 root root      4096 Jul  6 10:10 zlm

A basic import of a table named EMPLOYEES in the corp database:

    We usually configure parameters in the MEMORY layer, then load them into the RUNTIME layer to make them take effect. In the end, we should save them to DISK for durable storage. ProxySQL also provides some simple syntax to transfer configurations between those layers, listed below (a usage sketch follows the list):

1 id              auto-increment id                    int(11)
2 dept_id         department id                        int(11)
3 dept_name       department name                      varchar(30)
4 parent_id       parent department id                 tinyint(4)
5 tlevel          level id                             tinyint(4)
6 create_time     department creation time             datetime
7 update_time     department info modification time    datetime

 

$ sqoop import --connect jdbc:mysql://db.foo.com/corp --table EMPLOYEES

 

 

***Prepare the backup.***

A basic import requiring a login:

  • LOAD MYSQL object FROM MEMORY or LOAD MYSQL object TO RUNTIME 
  • SAVE MYSQL object TO MEMORY or SAVE MYSQL object FROM RUNTIME 
  • LOAD MYSQL object TO MEMORY or LOAD MYSQL object FROM DISK 
  • SAVE MYSQL object FROM MEMORY or SAVE MYSQL object TO DISK 
  • LOAD MYSQL object FROM CONFIG
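As a usage sketch against the admin interface (taking mysql_servers as the object; the INSERT itself is illustrative), a change is made in MEMORY, pushed to RUNTIME, and then persisted to DISK:

INSERT INTO mysql_servers(hostgroup_id,hostname,port) VALUES (0,'192.168.1.101',3306);
LOAD MYSQL SERVERS TO RUNTIME;   -- MEMORY -> RUNTIME, takes effect immediately
SAVE MYSQL SERVERS TO DISK;      -- MEMORY -> DISK, survives a restart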

**conference_room table**

 1 [root@zlm1 10:17:32 /data/backup/2018-07-06_10-09-22]
 2 #innobackupex --defaults-file=/data/mysql/mysql3306/my3306.cnf --host=localhost --user=root --password=Passw0rd --sock=/tmp/mysql3306.sock --apply-log /data/backup/2018-07-06_10-09-22/
 3 180706 10:18:21 innobackupex: Starting the apply-log operation
 4 
 5 IMPORTANT: Please check that the apply-log run completes successfully.
 6            At the end of a successful apply-log run innobackupex
 7            prints "completed OK!".
 8            
 9 //Omits the intermediate output.
10 
11 InnoDB: Database was not shutdown normally!
12 InnoDB: Starting crash recovery.
13 InnoDB: xtrabackup: Last MySQL binlog file position 139807334, file name mysql-bin.000069
14 InnoDB: Removed temporary tablespace data file: "ibtmp1"
15 InnoDB: Creating shared tablespace for temporary tables
16 InnoDB: Setting file './ibtmp1' size to 12 MB. Physically writing the file full; Please wait ...
17 InnoDB: File './ibtmp1' size is now 12 MB.
18 InnoDB: 96 redo rollback segment(s) found. 1 redo rollback segment(s) are active.
19 InnoDB: 32 non-redo rollback segment(s) are active.
20 InnoDB: Waiting for purge to start
21 InnoDB: 5.7.13 started; log sequence number 1703733781
22 xtrabackup: starting shutdown with innodb_fast_shutdown = 1
23 InnoDB: page_cleaner: 1000ms intended loop took 10865ms. The settings might not be optimal. (flushed=0 and evicted=0, during the time.)
24 InnoDB: FTS optimize thread exiting.
25 InnoDB: Starting shutdown...
26 InnoDB: Shutdown completed; log sequence number 1703733800
27 180706 10:18:36 completed OK!

$ sqoop import --connect jdbc:mysql://db.foo.com/corp --table EMPLOYEES

 

1 id                 auto-increment id              int(11)
2 room_id            meeting room id                int(11)
3 room_building_id   building number                int(11)
4 room_num           room number                    int(11)
5 room_max_num       maximum capacity               int(11)
6 room_status        meeting room status            tinyint(4)
7 create_time        room creation time             datetime
8 update_time        room info modification time    datetime

 

--username SomeUser -P

The comparison with other popular middleware tools.

 

**Drop two tables to mimic a misoperation.**

Enter password: (hidden)

**room_reservation table**

 1 root@localhost:mysql3306.sock [(none)]>show tables from zlm;
 2  ---------------- 
 3 | Tables_in_zlm  |
 4  ---------------- 
 5 | customer       |
 6 | goods          |
 7 | semi_sync_test |
 8 | test_flashbk   |
 9 | test_myisam    |
10  ---------------- 
11 5 rows in set (0.00 sec)
12 
13 root@localhost:mysql3306.sock [(none)]>show tables from sysbench;
14  -------------------- 
15 | Tables_in_sysbench |
16  -------------------- 
17 | sbtest1            |
18 | sbtest10           |
19 | sbtest2            |
20 | sbtest3            |
21 | sbtest4            |
22 | sbtest5            |
23 | sbtest6            |
24 | sbtest7            |
25 | sbtest8            |
26 | sbtest9            |
27  -------------------- 
28 10 rows in set (0.00 sec)
29 
30 root@localhost:mysql3306.sock [(none)]>drop table zlm.test_flashbk,sysbench.sbtest1;
31 Query OK, 0 rows affected (0.11 sec)

Selecting specific columns from the EMPLOYEES table:

 

 1 meeting room reservation table (room_reservation)
 2 id                  auto-increment id           int(11)
 3 book_id             booking ticket id           int(11)
 4 book_room_id        booked meeting room id      int(11)
 5 book_start_time     booking start time          datetime
 6 book_stop_time      booking end time            datetime
 7 book_user_id        booker id                   int(11)
 8 book_usage          booking purpose             varchar(200)
 9 book_status         booking ticket status       tinyint(4)
10 create_time         ticket creation time        datetime
11 update_time         ticket modification time    datetime

 

$ sqoop import --connect jdbc:mysql://db.foo.com/corp --table EMPLOYEES

Official website:

 

***Diagnose the .frm files from the Xtrabackup backup using mysqlfrm.***

--columns "employee_id,first_name,last_name,job_title"

http://www.proxysql.com/

**Configure the database connection.**

 1 [root@zlm1 10:35:56 /data/backup/2018-07-06_10-09-22]
 2 #mysqlfrm --basedir=/usr/local/mysql --port=8219 --diagnostic /data/backup/2018-07-06_10-09-22/sysbench/sbtest1.frm /data/backup/2018-07-06_10-09-22/zlm/test_flashbk.frm
 3 # WARNING The --port option is not used in the --diagnostic mode.
 4 # WARNING: Cannot generate character set or collation names without the --server option.
 5 # CAUTION: The diagnostic mode is a best-effort parse of the .frm file. As such, it may not identify all of the components of the table correctly. This is especially true for damaged files. It will also not read the default values for the columns and the resulting statement may not be syntactically correct.
 6 # Reading .frm file for /data/backup/2018-07-06_10-09-22/sysbench/sbtest1.frm:
 7 # The .frm file is a TABLE.
 8 # CREATE TABLE Statement:
 9 
10 CREATE TABLE `sysbench`.`sbtest1` (
11   `id` int(11) NOT NULL AUTO_INCREMENT, 
12   `k` int(11) NOT NULL, 
13   `c` char(360) NOT NULL, 
14   `pad` char(180) NOT NULL, 
15 PRIMARY KEY `PRIMARY` (`id`),
16 KEY `k_1` (`k`)
17 ) ENGINE=InnoDB;
18 
19 # Reading .frm file for /data/backup/2018-07-06_10-09-22/zlm/test_flashbk.frm:
20 # The .frm file is a TABLE.
21 # CREATE TABLE Statement:
22 
23 CREATE TABLE `zlm`.`test_flashbk` (
24   `id` bigint(20) NOT NULL AUTO_INCREMENT, 
25   `name` varchar(80) NOT NULL, 
26 PRIMARY KEY `PRIMARY` (`id`)
27 ) ENGINE=InnoDB;
28 
29 #...done.
30 
31 //If you want to get the character set information, "--server" is indispensable.
32 //You can use either <dbname>:<tablename>.frm or just <tablename>.frm.
33 //"--port" can be omitted in --diagnostic mode.

Controlling the import parallelism (using 8 parallel tasks):

 

(Figure 3)

 

$ sqoop import --connect jdbc:mysql://db.foo.com/corp --table EMPLOYEES

GitHub website:

 

**Create empty tables using the CREATE statements above.**

-m 8

https://github.com/sysown/proxysql/releases/

Use the mouse to create the target tables in dbschema.

 1 root@localhost:mysql3306.sock [(none)]>CREATE TABLE `sysbench`.`sbtest1` (
 2     ->   `id` int(11) NOT NULL AUTO_INCREMENT, 
 3     ->   `k` int(11) NOT NULL, 
 4     ->   `c` char(360) NOT NULL, 
 5     ->   `pad` char(180) NOT NULL, 
 6     -> PRIMARY KEY `PRIMARY` (`id`),
 7     -> KEY `k_1` (`k`)
 8     -> ) ENGINE=InnoDB;
 9 ERROR 1074 (42000): Column length too big for column 'c' (max = 255); use BLOB or TEXT instead
10 root@localhost:mysql3306.sock [(none)]>CREATE TABLE `zlm`.`test_flashbk` (
11     ->   `id` bigint(20) NOT NULL AUTO_INCREMENT, 
12     ->   `name` varchar(80) NOT NULL, 
13     -> PRIMARY KEY `PRIMARY` (`id`)
14     -> ) ENGINE=InnoDB;
15 Query OK, 0 rows affected (0.02 sec)
16 
17 //We get an error when creating table sysbench.sbtest1 because the char length exceeds the 255 limit (without --server, mysqlfrm reported byte lengths instead of character lengths).

Storing data in SequenceFiles, and setting the generated class name to com.foocorp.Employee:

 

(Figure 4)

 

$ sqoop import --connect jdbc:mysql://db.foo.com/corp --table EMPLOYEES

Percona website:

 

**Check the structure of the sbtest2 table in sysbench.**

--class-name com.foocorp.Employee --as-sequencefile

Check the primary key & unique key (even other keys, but I'm not creating them yet).

 1 root@localhost:mysql3306.sock [(none)]>show create table sysbench.sbtest2\G
 2 *************************** 1. row ***************************
 3        Table: sbtest2
 4 Create Table: CREATE TABLE `sbtest2` (
 5   `id` int(11) NOT NULL AUTO_INCREMENT,
 6   `k` int(11) NOT NULL DEFAULT '0',
 7   `c` char(120) NOT NULL DEFAULT '', //In the counterpart table,the value is 120.
 8   `pad` char(60) NOT NULL DEFAULT '', //In the counterpart table,the value is 60.
 9   PRIMARY KEY (`id`),
10   KEY `k_2` (`k`)
11 ) ENGINE=InnoDB AUTO_INCREMENT=100001 DEFAULT CHARSET=utf8
12 1 row in set (0.00 sec)

Specifying the delimiters to use in a text-mode import:

 

(Figure 5)

 

$ sqoop import --connect jdbc:mysql://db.foo.com/corp --table EMPLOYEES

Procedure

 

***Change the CREATE statement to use the values from sbtest2.***

--fields-terminated-by '\t' --lines-terminated-by '\n'

 

Check the foreign key.


--optionally-enclosed-by '\"'

1. Installation

(Figure 6)

 

Importing the data to Hive:

 

 

**Add a write lock on these two tables.**

$ sqoop import --connect jdbc:mysql://db.foo.com/corp --table EMPLOYEES

Download and install ProxySQL in rpm mode on node zlm2.

After you click the OK button, the table will be created in the "mrbs" database.

 1 root@localhost:mysql3306.sock [(none)]>lock tables sysbench.sbtest1 write;
 2 Query OK, 0 rows affected (0.00 sec)
 3 
 4 root@localhost:mysql3306.sock [(none)]>lock tables zlm.test_flashbk write;
 5 Query OK, 0 rows affected (0.00 sec)
 6 
 7 root@localhost:mysql3306.sock [(none)]>alter table sysbench.sbtest1 discard tablespace;
 8 ERROR 1100 (HY000): Table 'sbtest1' was not locked with LOCK TABLES //When we lock another table, the lock on the previous table is released.
 9 root@localhost:mysql3306.sock [(none)]>lock tables sysbench.sbtest1 write; //This time, lock one table and discard its tablespace, in order.
10 Query OK, 0 rows affected (0.00 sec)
11 
12 root@localhost:mysql3306.sock [(none)]>alter table sysbench.sbtest1 discard tablespace;
13 Query OK, 0 rows affected (0.00 sec)
14 
15 root@localhost:mysql3306.sock [(none)]>lock tables zlm.test_flashbk write;
16 Query OK, 0 rows affected (0.00 sec)
17 
18 root@localhost:mysql3306.sock [(none)]>alter table zlm.test_flashbk discard tablespace;
19 Query OK, 0 rows affected (0.00 sec)

--hive-import

 1 [root@zlm2 08:00:34 ~]
 2 #wget https://github.com/sysown/proxysql/releases/download/v1.4.10/proxysql-1.4.10-1-centos7.x86_64.rpm
 3 --2018-08-10 08:01:20--  https://github.com/sysown/proxysql/releases/download/v1.4.10/proxysql-1.4.10-1-centos7.x86_64.rpm
 4 Resolving github.com (github.com)... 13.229.188.59, 52.74.223.119, 13.250.177.223
 5 Connecting to github.com (github.com)|13.229.188.59|:443... connected.
 6 HTTP request sent, awaiting response... 302 Found
 7 Location: https://github-production-release-asset-2e65be.s3.amazonaws.com/27358084/305d1618-9a5b-11e8-9b04-df8e3393a6b4?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAIWNJYAX4CSVEH53A/20180810/us-east-1/s3/aws4_request&X-Amz-Date=20180810T060119Z&X-Amz-Expires=300&X-Amz-Signature=f4f676653d4cd4f34f4b75b2a4e7dfeaee8bf9d8e35dc91938f41961123a9c9f&X-Amz-SignedHeaders=host&actor_id=0&response-content-disposition=attachment; filename=proxysql-1.4.10-1-centos7.x86_64.rpm&response-content-type=application/octet-stream [following]
 8 --2018-08-10 08:01:29--  https://github-production-release-asset-2e65be.s3.amazonaws.com/27358084/305d1618-9a5b-11e8-9b04-df8e3393a6b4?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAIWNJYAX4CSVEH53A/20180810/us-east-1/s3/aws4_request&X-Amz-Date=20180810T060119Z&X-Amz-Expires=300&X-Amz-Signature=f4f676653d4cd4f34f4b75b2a4e7dfeaee8bf9d8e35dc91938f41961123a9c9f&X-Amz-SignedHeaders=host&actor_id=0&response-content-disposition=attachment; filename=proxysql-1.4.10-1-centos7.x86_64.rpm&response-content-type=application/octet-stream
 9 Resolving github-production-release-asset-2e65be.s3.amazonaws.com (github-production-release-asset-2e65be.s3.amazonaws.com)... 54.231.32.91
10 Connecting to github-production-release-asset-2e65be.s3.amazonaws.com (github-production-release-asset-2e65be.s3.amazonaws.com)|54.231.32.91|:443... connected.
11 HTTP request sent, awaiting response... 200 OK
12 Length: 5982016 (5.7M) [application/octet-stream]
13 Saving to: ‘proxysql-1.4.10-1-centos7.x86_64.rpm’
14 
15 100%[===========================================================================================================>] 5,982,016   1.12MB/s   in 12s    
16 
17 2018-08-10 08:02:14 (472 KB/s) - ‘proxysql-1.4.10-1-centos7.x86_64.rpm’ saved [5982016/5982016]
18 
19 
20 [root@zlm2 08:02:14 ~]
21 #ls -l|grep proxysql
22 -rw-r--r--   1 root root    5982016 Aug  7 15:03 proxysql-1.4.10-1-centos7.x86_64.rpm
23 
24 [root@zlm2 08:04:48 ~]
25 #yum -y localinstall proxysql-1.4.10-1-centos7.x86_64.rpm
26 Loaded plugins: fastestmirror
27 Examining proxysql-1.4.10-1-centos7.x86_64.rpm: proxysql-1.4.10-1.x86_64
28 Marking proxysql-1.4.10-1-centos7.x86_64.rpm to be installed
29 Resolving Dependencies
30 --> Running transaction check
31 ---> Package proxysql.x86_64 0:1.4.10-1 will be installed
32 --> Finished Dependency Resolution
33 
34 Dependencies Resolved
35 
36 =====================================================================================================================================================
37  Package                      Arch                       Version                         Repository                                             Size
38 =====================================================================================================================================================
39 Installing:
40  proxysql                     x86_64                     1.4.10-1                        /proxysql-1.4.10-1-centos7.x86_64                      22 M
41 
42 Transaction Summary
43 =====================================================================================================================================================
44 Install  1 Package
45 
46 Total size: 22 M
47 Installed size: 22 M
48 Downloading packages:
49 Running transaction check
50 Running transaction test
51 Transaction test succeeded
52 Running transaction
53   Installing : proxysql-1.4.10-1.x86_64                                                                                                          1/1 
54   Verifying  : proxysql-1.4.10-1.x86_64                                                                                                          1/1 
55 
56 Installed:
57   proxysql.x86_64 0:1.4.10-1                                                                                                                         
58 
59 Complete!

(Figure 7)

 

Importing only new employees:

 

 

**Copy the .ibd files from the Xtrabackup backup and change their ownership.**

$ sqoop import --connect jdbc:mysql://db.foo.com/corp --table EMPLOYEES

Check the default configuration file.

Check the tables in "mrbys".

 1 [root@zlm1 11:06:18 /data/backup/2018-07-06_10-09-22]
 2 #cp sysbench/sbtest1.ibd /data/mysql/mysql3306/data/sysbench
 3 
 4 [root@zlm1 11:07:50 /data/backup/2018-07-06_10-09-22]
 5 #cp zlm/test_flashbk.ibd /data/mysql/mysql3306/data/zlm
 6 
 7 [root@zlm1 11:08:05 /data/backup/2018-07-06_10-09-22]
 8 #chown -R mysql.mysql /data/mysql/mysql3306/data
 9 
10 [root@zlm1 11:11:25 /data/backup/2018-07-06_10-09-22]
11 #ls -l /data/mysql/mysql3306/data/sysbench | grep sbtest1.ibd
12 -rw-r----- 1 mysql mysql 33554432 Jul  6 11:07 sbtest1.ibd
13 
14 [root@zlm1 11:12:39 /data/backup/2018-07-06_10-09-22]
15 #ls -l /data/mysql/mysql3306/data/zlm | grep test_flashbk.ibd
16 -rw-r----- 1 mysql mysql  12582912 Jul  6 11:08 test_flashbk.ibd

--where "start_date > '2010-01-01'"

  1 [root@zlm2 08:05:09 ~]
  2 #rpm -ql proxysql
  3 /etc/init.d/proxysql
  4 /etc/proxysql.cnf //This is the configuration file of ProxySQL.
  5 /usr/bin/proxysql
  6 /usr/share/proxysql/tools/proxysql_galera_checker.sh //This is the script to check the status of hostgroups.
  7 /usr/share/proxysql/tools/proxysql_galera_writer.pl
  8 
  9 [root@zlm2 08:08:13 ~]
 10 #cat /etc/proxysql.cnf
 11 #file proxysql.cfg
 12 
 13 ########################################################################################
 14 # This config file is parsed using libconfig , and its grammar is described in:        
 15 # http://www.hyperrealm.com/libconfig/libconfig_manual.html#Configuration-File-Grammar 
 16 # Grammar is also copied at the end of this file                                       
 17 ########################################################################################
 18 
 19 ########################################################################################
 20 # IMPORTANT INFORMATION REGARDING THIS CONFIGURATION FILE:                             
 21 ########################################################################################
 22 # On startup, ProxySQL reads its config file (if present) to determine its datadir. 
 23 # What happens next depends on if the database file (disk) is present in the defined
 24 # datadir (i.e. "/var/lib/proxysql/proxysql.db").
 25 #
 26 # If the database file is found, ProxySQL initializes its in-memory configuration from 
 27 # the persisted on-disk database. So, disk configuration gets loaded into memory and 
 28 # then propagated towards the runtime configuration. 
 29 #
 30 # If the database file is not found and a config file exists, the config file is parsed 
 31 # and its content is loaded into the in-memory database, to then be both saved on-disk 
 32 # database and loaded at runtime.
 33 #
 34 # IMPORTANT: If a database file is found, the config file is NOT parsed. In this case
 35 #            ProxySQL initializes its in-memory configuration from the persisted on-disk
 36 #            database ONLY. In other words, the configuration found in the proxysql.cnf
 37 #            file is only used to initial the on-disk database read on the first startup.
 38 #
 39 # In order to FORCE a re-initialise of the on-disk database from the configuration file 
 40 # the ProxySQL service should be started with "service proxysql initial".
 41 #
 42 ########################################################################################
 43 
 44 datadir="/var/lib/proxysql" //This is the position of SQLite database of ProxySQL.
 45 
 46 admin_variables=
 47 {
 48     admin_credentials="admin:admin" //The default user/password is admin/admin.
 49 #    mysql_ifaces="127.0.0.1:6032;/tmp/proxysql_admin.sock"
 50     mysql_ifaces="0.0.0.0:6032" //The default admin port is 6032.
 51 #    refresh_interval=2000
 52 #    debug=true
 53 }
 54 
 55 mysql_variables=
 56 {
 57     threads=4
 58     max_connections=2048
 59     default_query_delay=0
 60     default_query_timeout=36000000
 61     have_compress=true
 62     poll_timeout=2000
 63 #    interfaces="0.0.0.0:6033;/tmp/proxysql.sock"
 64     interfaces="0.0.0.0:6033" //The default service port is 6033.
 65     default_schema="information_schema"
 66     stacksize=1048576
 67     server_version="5.5.30"
 68     connect_timeout_server=3000
 69 # make sure to configure monitor username and password
 70 # https://github.com/sysown/proxysql/wiki/Global-variables#mysql-monitor_username-mysql-monitor_password
 71     monitor_username="monitor"
 72     monitor_password="monitor"
 73     monitor_history=600000
 74     monitor_connect_interval=60000
 75     monitor_ping_interval=10000
 76     monitor_read_only_interval=1500
 77     monitor_read_only_timeout=500
 78     ping_interval_server_msec=120000
 79     ping_timeout_server=500
 80     commands_stats=true
 81     sessions_sort=true
 82     connect_retries_on_failure=10
 83 }
 84 
 85 
 86 # defines all the MySQL servers
 87 mysql_servers =
 88 (
 89 #    {
 90 #        address = "127.0.0.1" # no default, required . If port is 0 , address is interpred as a Unix Socket Domain
 91 #        port = 3306           # no default, required . If port is 0 , address is interpred as a Unix Socket Domain
 92 #        hostgroup = 0            # no default, required
 93 #        status = "ONLINE"     # default: ONLINE
 94 #        weight = 1            # default: 1
 95 #        compression = 0       # default: 0
 96 #   max_replication_lag = 10  # default 0 . If greater than 0 and replication lag passes such threshold, the server is shunned
 97 #    },
 98 #    {
 99 #        address = "/var/lib/mysql/mysql.sock"
100 #        port = 0
101 #        hostgroup = 0
102 #    },
103 #    {
104 #        address="127.0.0.1"
105 #        port=21891
106 #        hostgroup=0
107 #        max_connections=200
108 #    },
109 #    { address="127.0.0.2" , port=3306 , hostgroup=0, max_connections=5 },
110 #    { address="127.0.0.1" , port=21892 , hostgroup=1 },
111 #    { address="127.0.0.1" , port=21893 , hostgroup=1 }
112 #    { address="127.0.0.2" , port=3306 , hostgroup=1 },
113 #    { address="127.0.0.3" , port=3306 , hostgroup=1 },
114 #    { address="127.0.0.4" , port=3306 , hostgroup=1 },
115 #    { address="/var/lib/mysql/mysql.sock" , port=0 , hostgroup=1 }
116 )
117 
118 
119 # defines all the MySQL users
120 mysql_users:
121 (
122 #    {
123 #        username = "username" # no default , required
124 #        password = "password" # default: ''
125 #        default_hostgroup = 0 # default: 0
126 #        active = 1            # default: 1
127 #    },
128 #    {
129 #        username = "root"
130 #        password = ""
131 #        default_hostgroup = 0
132 #        max_connections=1000
133 #        default_schema="test"
134 #        active = 1
135 #    },
136 #    { username = "user1" , password = "password" , default_hostgroup = 0 , active = 0 }
137 )
138 
139 
140 
141 #defines MySQL Query Rules
142 mysql_query_rules:
143 (
144 #    {
145 #        rule_id=1
146 #        active=1
147 #        match_pattern="^SELECT .* FOR UPDATE$"
148 #        destination_hostgroup=0
149 #        apply=1
150 #    },
151 #    {
152 #        rule_id=2
153 #        active=1
154 #        match_pattern="^SELECT"
155 #        destination_hostgroup=1
156 #        apply=1
157 #    }
158 )
159 
160 scheduler=
161 (
162 #  {
163 #    id=1
164 #    active=0
165 #    interval_ms=10000
166 #    filename="/var/lib/proxysql/proxysql_galera_checker.sh"
167 #    arg1="0"
168 #    arg2="0"
169 #    arg3="0"
170 #    arg4="1"
171 #    arg5="/var/lib/proxysql/proxysql_galera_checker.log"
172 #  }
173 )
174 
175 
176 mysql_replication_hostgroups=
177 (
178 #        {
179 #                writer_hostgroup=30
180 #                reader_hostgroup=40
181 #                comment="test repl 1"
182 #       },
183 #       {
184 #                writer_hostgroup=50
185 #                reader_hostgroup=60
186 #                comment="test repl 2"
187 #        }
188 )
189 
190 
191 
192 
193 # http://www.hyperrealm.com/libconfig/libconfig_manual.html#Configuration-File-Grammar
194 #
195 # Below is the BNF grammar for configuration files. Comments and include directives are not part of the grammar, so they are not included here. 
196 #
197 # configuration = setting-list | empty
198 #
199 # setting-list = setting | setting-list setting
200 #     
201 # setting = name (":" | "=") value (";" | "," | empty)
202 #     
203 # value = scalar-value | array | list | group
204 #     
205 # value-list = value | value-list "," value
206 #     
207 # scalar-value = boolean | integer | integer64 | hex | hex64 | float
208 #                | string
209 #     
210 # scalar-value-list = scalar-value | scalar-value-list "," scalar-value
211 #     
212 # array = "[" (scalar-value-list | empty) "]"
213 #     
214 # list = "(" (value-list | empty) ")"
215 #     
216 # group = "{" (setting-list | empty) "}"
217 #     
218 # empty =
 1 root@localhost:mysql3306.sock [mrbs]>show tables;
 2  ------------------ 
 3 | Tables_in_mrbs   |
 4  ------------------ 
 5 | conference_room  |
 6 | department       |
 7 | employee         |
 8 | room_reservation |
 9  ------------------ 
10 4 rows in set (0.01 sec)
11 
12 root@localhost:mysql3306.sock [mrbs]>show create table employee\G
13 *************************** 1. row ***************************
14        Table: employee
15 Create Table: CREATE TABLE `employee` (
16   `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
17   `user_id` int(10) unsigned NOT NULL COMMENT '工号',
18   `user_name` varchar(20) NOT NULL COMMENT '用户名称',
19   `user_phone` bigint(20) unsigned NOT NULL COMMENT '用户手机号',
20   `user_email` varchar(50) DEFAULT NULL COMMENT '用户邮箱',
21   `user_dept_id` int(10) unsigned NOT NULL COMMENT '用户所在部门id',
22   `user_status` tinyint(3) unsigned NOT NULL COMMENT '是否在职',
23   `create_time` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT '用户创建时间',
24   `update_time` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT '用户信息修改时间',
25   PRIMARY KEY (`id`),
26   UNIQUE KEY `uk_user_id` (`user_id`)
27 ) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='用户表'
28 1 row in set (0.00 sec)
29 
30 root@localhost:mysql3306.sock [mrbs]>show create table department\G
31 *************************** 1. row ***************************
32        Table: department
33 Create Table: CREATE TABLE `department` (
34   `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
35   `dept_id` int(10) unsigned NOT NULL COMMENT '部门id',
36   `dept_name` varchar(30) NOT NULL COMMENT '部门名称',
37   `parent_id` tinyint(3) unsigned NOT NULL,
38   `tlevel` tinyint(3) unsigned NOT NULL COMMENT '层级',
39   `create_time` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT '部门创建时间',
40   `update_time` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT '部门信息修改时间',
41   PRIMARY KEY (`id`),
42   UNIQUE KEY `uk_dept_id` (`dept_id`)
43 ) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='部门表'
44 1 row in set (0.00 sec)
45 
46 root@localhost:mysql3306.sock [mrbs]>show create table conference_room\G
47 *************************** 1. row ***************************
48        Table: conference_room
49 Create Table: CREATE TABLE `conference_room` (
50   `id` int(10) unsigned NOT NULL AUTO_INCREMENT COMMENT '自增id',
51   `room_id` int(10) unsigned NOT NULL COMMENT '会议室id',
52   `room_building_id` int(10) unsigned NOT NULL COMMENT '楼号',
53   `room_num` int(10) unsigned NOT NULL COMMENT '房间号',
54   `room_max_num` int(10) unsigned NOT NULL COMMENT '最大容纳人数',
55   `room_status` tinyint(3) unsigned NOT NULL COMMENT '会议室状态',
56   `create_time` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT '会议室创建时间',
57   `update_time` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT '会议室信息修改时间',
58   PRIMARY KEY (`id`),
59   UNIQUE KEY `uk_room_id` (`room_id`)
60 ) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='会议室表'
61 1 row in set (0.00 sec)
62 
63 root@localhost:mysql3306.sock [mrbs]>show create table room_reservation\G
64 *************************** 1. row ***************************
65        Table: room_reservation
66 Create Table: CREATE TABLE `room_reservation` (
67   `id` int(10) unsigned NOT NULL AUTO_INCREMENT COMMENT '自增id',
68   `book_id` int(10) unsigned NOT NULL COMMENT '预定工单id',
69   `book_room_id` int(10) unsigned NOT NULL COMMENT '预定会议室id',
70   `book_start_time` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT '预定开始时间',
71   `book_stop_time` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT '预定结束时间',
72   `book_user_id` int(10) unsigned NOT NULL COMMENT '预定人id',
73   `book_usage` varchar(200) NOT NULL COMMENT '预定用途',
74   `book_status` tinyint(3) unsigned NOT NULL COMMENT '预定工单状态',
75   `create_time` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT '预定工单创建时间',
76   `update_time` date DEFAULT NULL COMMENT '预定工单修改时间',
77   PRIMARY KEY (`id`),
78   UNIQUE KEY `uk_book_id` (`book_id`)
79 ) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='会议室预定表'
80 1 row in set (0.00 sec)

 

Changing the splitting column from the default:

 

 

**Import the tablespaces and check the data of the tables.**

$ sqoop import --connect jdbc:mysql://db.foo.com/corp --table EMPLOYEES

Start ProxySQL and check its tables.

  

 1 root@localhost:mysql3306.sock [(none)]>alter table sysbench.sbtest1 import tablespace;
 2 Query OK, 0 rows affected, 1 warning (1.73 sec)
 3 
 4 root@localhost:mysql3306.sock [(none)]>show warnings;
 5  --------- ------ -------------------------------------------------------------------------------------------------------------------------------------------------- 
 6 | Level   | Code | Message                                                                                                                                          |
 7  --------- ------ -------------------------------------------------------------------------------------------------------------------------------------------------- 
 8 | Warning | 1810 | InnoDB: IO Read error: (2, No such file or directory) Error opening './sysbench/sbtest1.cfg', will attempt to import without schema verification |
 9  --------- ------ -------------------------------------------------------------------------------------------------------------------------------------------------- 
10 1 row in set (0.00 sec)
11 
12 root@localhost:mysql3306.sock [(none)]>alter table zlm.test_flashbk import tablespace;
13 Query OK, 0 rows affected, 1 warning (1.01 sec)
14 
15 root@localhost:mysql3306.sock [(none)]>show warnings;
16  --------- ------ -------------------------------------------------------------------------------------------------------------------------------------------------- 
17 | Level   | Code | Message                                                                                                                                          |
18  --------- ------ -------------------------------------------------------------------------------------------------------------------------------------------------- 
19 | Warning | 1810 | InnoDB: IO Read error: (2, No such file or directory) Error opening './zlm/test_flashbk.cfg', will attempt to import without schema verification |
20  --------- ------ -------------------------------------------------------------------------------------------------------------------------------------------------- 
21 1 row in set (0.00 sec)
22 
23 root@localhost:mysql3306.sock [(none)]>select count(*) from sysbench.sbtest1;
24  ---------- 
25 | count(*) |
26  ---------- 
27 |   100000 |
28  ---------- 
29 1 row in set (0.25 sec)
30 
31 root@localhost:mysql3306.sock [(none)]>select count(*) from zlm.test_flashbk;
32  ---------- 
33 | count(*) |
34  ---------- 
35 |   100000 |
36  ---------- 
37 1 row in set (0.10 sec)
38 
39 //The warnings merely report the missing .cfg files, which really doesn't matter here.
40 //A .cfg file is normally created by executing "FLUSH TABLES ... FOR EXPORT;" on the source instance.
41 //Because a missing .cfg only skips schema verification, the transportable tablespace feature remains usable in crash-recovery scenarios like this one.
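For reference, this is roughly how the .ibd/.cfg pair is produced and attached when both instances are available; a minimal sketch assuming the same sysbench.sbtest1 table, with illustrative paths:

-- On the source instance: quiesce the table; this writes sbtest1.cfg next to sbtest1.ibd
FLUSH TABLES sysbench.sbtest1 FOR EXPORT;
-- (copy sbtest1.ibd and sbtest1.cfg into the target datadir at this point)
UNLOCK TABLES;

-- On the target instance: the table definition must already exist
ALTER TABLE sysbench.sbtest1 DISCARD TABLESPACE;
-- (place the copied files under ./sysbench/ and fix ownership, e.g. chown mysql:mysql)
ALTER TABLE sysbench.sbtest1 IMPORT TABLESPACE;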


  1 [root@zlm2 08:30:12 ~]
  2 #service proxysql start
  3 Starting ProxySQL: 2018-08-10 08:30:38 [INFO] Using config file /etc/proxysql.cnf
  4 DONE!
  5 
  6 [root@zlm2 08:30:38 ~]
  7 #ps aux|grep proxysql
  8 root      4307  0.0  0.5  58688  5184 ?        S    08:30   0:00 proxysql -c /etc/proxysql.cnf -D /var/lib/proxysql
  9 root      4308  0.3  1.8 102612 19020 ?        Sl   08:30   0:00 proxysql -c /etc/proxysql.cnf -D /var/lib/proxysql
 10 root      4335  0.0  0.0 112640   960 pts/0    R    08:30   0:00 grep --color=auto proxysql
 11 
 12 [root@zlm2 08:32:20 ~]
 13 #cd /var/lib/proxysql
 14 
 15 [root@zlm2 08:33:55 /var/lib/proxysql]
 16 #ls -l
 17 total 244
 18 -rw------- 1 root root 122880 Aug 10 08:16 proxysql.db
 19 -rw------- 1 root root   7320 Aug 10 08:30 proxysql.log
 20 -rw-r--r-- 1 root root      5 Aug 10 08:30 proxysql.pid
 21 -rw------- 1 root root 110592 Aug 10 08:33 proxysql_stats.db
 22 -rw-r--r-- 1 root root      2 Aug 10 08:29 reload
 23 
 24 [root@zlm2 08:33:57 /var/lib/proxysql]
 25 #mysql -uadmin -padmin -h127.0.0.1 -P6032
 26 mysql: [Warning] Using a password on the command line interface can be insecure.
 27 Welcome to the MySQL monitor.  Commands end with ; or \g.
 28 Your MySQL connection id is 1
 29 Server version: 5.5.30 (ProxySQL Admin Module)
 30 
 31 Copyright (c) 2009-2018 Percona LLC and/or its affiliates
 32 Copyright (c) 2000, 2018, Oracle and/or its affiliates. All rights reserved.
 33 
 34 Oracle is a registered trademark of Oracle Corporation and/or its
 35 affiliates. Other names may be trademarks of their respective
 36 owners.
 37 
 38 Type 'help;' or 'h' for help. Type 'c' to clear the current input statement.
 39 
 40 (admin@127.0.0.1 6032)[(none)]>show databases;
 41  ----- --------------- ------------------------------------- 
 42 | seq | name          | file                                |
 43  ----- --------------- ------------------------------------- 
 44 | 0   | main          |                                     |
 45 | 2   | disk          | /var/lib/proxysql/proxysql.db       |
 46 | 3   | stats         |                                     |
 47 | 4   | monitor       |                                     |
 48 | 5   | stats_history | /var/lib/proxysql/proxysql_stats.db |
 49  ----- --------------- ------------------------------------- 
 50 5 rows in set (0.00 sec)
 51 
 52 //There are five databases in all in ProxySQL. In previous versions, the stats_history database didn't exist.
 53 
 54 (admin@127.0.0.1 6032)[(none)]>show tables from main;
 55  -------------------------------------------- 
 56 | tables                                     |
 57  -------------------------------------------- 
 58 | global_variables                           | //It contains all the variables of ProxySQL.
 59 | mysql_collations                           |
 60 | mysql_group_replication_hostgroups         |
 61 | mysql_query_rules                          | //It contains all the query rules we'd like to use.
 62 | mysql_query_rules_fast_routing             |
 63 | mysql_replication_hostgroups               |
 64 | mysql_servers                              | //It contains the information of all servers relevant with MySQL.
 65 | mysql_users                                | //It contains the information of all users relevant with MySQL.
 66 | proxysql_servers                           | //It contains the information of all servers relevant with ProxySQL.
 67 | runtime_checksums_values                   |
 68 | runtime_global_variables                   |
 69 | runtime_mysql_group_replication_hostgroups |
 70 | runtime_mysql_query_rules                  |
 71 | runtime_mysql_query_rules_fast_routing     |
 72 | runtime_mysql_replication_hostgroups       |
 73 | runtime_mysql_servers                      |
 74 | runtime_mysql_users                        |
 75 | runtime_proxysql_servers                   |
 76 | runtime_scheduler                          |
 77 | scheduler                                  |
 78  -------------------------------------------- 
 79 20 rows in set (0.00 sec)
 80 
 81 //All the "runtime"-prefixed tables in the "main" database hold the configuration currently in effect at runtime. The other tables can be modified in memory mode.
 82 
 83 (admin@127.0.0.1 6032)[(none)]>show tables from disk;
 84  ------------------------------------ 
 85 | tables                             |
 86  ------------------------------------ 
 87 | global_variables                   |
 88 | mysql_collations                   |
 89 | mysql_group_replication_hostgroups |
 90 | mysql_query_rules                  |
 91 | mysql_query_rules_fast_routing     |
 92 | mysql_replication_hostgroups       |
 93 | mysql_servers                      |
 94 | mysql_users                        |
 95 | proxysql_servers                   |
 96 | scheduler                          |
 97  ------------------------------------ 
 98 10 rows in set (0.00 sec)
 99 
100 //These tables contain the data we have persisted to disk with the "SAVE MYSQL xxx TO DISK;" command.
101 //They have the same structure as their counterparts in the "main" database.
102 
103 (admin@127.0.0.1 6032)[(none)]>show tables from monitor;
104  ------------------------------------ 
105 | tables                             |
106  ------------------------------------ 
107 | mysql_server_connect_log           |
108 | mysql_server_group_replication_log |
109 | mysql_server_ping_log              |
110 | mysql_server_read_only_log         |
111 | mysql_server_replication_lag_log   |
112  ------------------------------------ 
113 5 rows in set (0.00 sec)
114 
115 (admin@127.0.0.1 6032)[(none)]>show tables from stats;
116  -------------------------------------- 
117 | tables                               |
118  -------------------------------------- 
119 | global_variables                     |
120 | stats_memory_metrics                 |
121 | stats_mysql_commands_counters        |
122 | stats_mysql_connection_pool          |
123 | stats_mysql_connection_pool_reset    |
124 | stats_mysql_global                   |
125 | stats_mysql_prepared_statements_info |
126 | stats_mysql_processlist              |
127 | stats_mysql_query_digest             |
128 | stats_mysql_query_digest_reset       |
129 | stats_mysql_query_rules              |
130 | stats_mysql_users                    |
131 | stats_proxysql_servers_checksums     |
132 | stats_proxysql_servers_metrics       |
133 | stats_proxysql_servers_status        |
134  -------------------------------------- 
135 15 rows in set (0.00 sec)

 

Verifying that an import was successful:

 

*Summary*

$ hadoop fs -ls EMPLOYEES
Found 5 items
drwxr-xr-x  - someuser somegrp          0 2010-04-27 16:40 /user/someuser/EMPLOYEES/_logs
-rw-r--r--  1 someuser somegrp    2913511 2010-04-27 16:40 /user/someuser/EMPLOYEES/part-m-00000
-rw-r--r--  1 someuser somegrp    1683938 2010-04-27 16:40 /user/someuser/EMPLOYEES/part-m-00001
-rw-r--r--  1 someuser somegrp    7245839 2010-04-27 16:40 /user/someuser/EMPLOYEES/part-m-00002
-rw-r--r--  1 someuser somegrp    7842523 2010-04-27 16:40 /user/someuser/EMPLOYEES/part-m-00003

$ hadoop fs -cat EMPLOYEES/part-m-00000 | head -n 10
0,joe,smith,engineering
1,jane,doe,marketing
...

2. Configure read/write splitting.

  • *mysqlfrm is a tool in mysql-utilities that specializes in analyzing .frm files in order to recover the missing structure of tables.*
  • mysqlfrm provides two modes of operation: 1. connecting to a running server with "--server" (the default mode); 2. spawning a new instance with "--basedir".
  • With the "--diagnostic" parameter, we can even extract information from a .frm file without a MySQL server installed on the host.
  • We cannot get character set and collation information for tables if we forget to use the "--server" option.
  • Of course, in order to use transportable tablespaces properly, the parameter "innodb_file_per_table=1" is necessary.
  • In my case above, the char-column definitions changed unexpectedly, which I'm still baffled by.
  • Also, we can restore these tables to any other server, just as the transportable tablespace feature does; a recap sketch follows this list.
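A minimal end-to-end recap of restoring a single InnoDB table from an Xtrabackup backup (the paths and the --export preparation step are illustrative assumptions, not taken from the log above):

# Prepare the backup with per-table export files (.cfg/.exp)
xtrabackup --prepare --export --target-dir=/data/backup/full

-- On the target instance: recreate the table definition (e.g., from mysqlfrm output), then
ALTER TABLE zlm.test_flashbk DISCARD TABLESPACE;

# Copy the table's .ibd (and .cfg if present) from the backup into the target datadir
cp /data/backup/full/zlm/test_flashbk.ibd /data/mysql/mysql3306/data/zlm/
chown mysql:mysql /data/mysql/mysql3306/data/zlm/test_flashbk.ibd

-- Attach the tablespace; a missing .cfg only skips schema verification
ALTER TABLE zlm.test_flashbk IMPORT TABLESPACE;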


 

 


Configure the hostgroup id.


 1 (admin@127.0.0.1 6032)[main]>show create table mysql_replication_hostgroups\G
 2 *************************** 1. row ***************************
 3        table: mysql_replication_hostgroups
 4 Create Table: CREATE TABLE mysql_replication_hostgroups (
 5     writer_hostgroup INT CHECK (writer_hostgroup>=0) NOT NULL PRIMARY KEY,
 6     reader_hostgroup INT NOT NULL CHECK (reader_hostgroup<>writer_hostgroup AND reader_hostgroup>0),
 7     comment VARCHAR NOT NULL DEFAULT '', UNIQUE (reader_hostgroup))
 8 1 row in set (0.00 sec)
 9 
10 (admin@127.0.0.1 6032)[main]>insert into mysql_replication_hostgroups(writer_hostgroup,reader_hostgroup,comment) values(10,20,'Test of ProxySQL');
11 Query OK, 1 row affected (0.00 sec)
12 
13 (admin@127.0.0.1 6032)[main]>select * from mysql_replication_hostgroups;
14  ------------------ ------------------ ------------------ 
15 | writer_hostgroup | reader_hostgroup | comment          |
16  ------------------ ------------------ ------------------ 
17 | 10               | 20               | Test of ProxySQL |
18  ------------------ ------------------ ------------------ 
19 1 row in set (0.00 sec)
20 
21 (admin@127.0.0.1 6032)[main]>select * from main.runtime_mysql_replication_hostgroups;
22 Empty set (0.00 sec)
23 
24 (admin@127.0.0.1 6032)[main]>select * from disk.mysql_replication_hostgroups;
25 Empty set (0.00 sec)
26 
27 (admin@127.0.0.1 6032)[main]>load mysql servers to runtime;save mysql servers to disk;
28 Query OK, 0 rows affected (0.00 sec)
29 
30 Query OK, 0 rows affected (0.04 sec)
31 
32 (admin@127.0.0.1 6032)[main]>select * from main.runtime_mysql_replication_hostgroups;
33  ------------------ ------------------ ------------------ 
34 | writer_hostgroup | reader_hostgroup | comment          |
35  ------------------ ------------------ ------------------ 
36 | 10               | 20               | Test of ProxySQL |
37  ------------------ ------------------ ------------------ 
38 1 row in set (0.00 sec)
39 
40 (admin@127.0.0.1 6032)[main]>select * from disk.mysql_replication_hostgroups;
41  ------------------ ------------------ ------------------ 
42 | writer_hostgroup | reader_hostgroup | comment          |
43  ------------------ ------------------ ------------------ 
44 | 10               | 20               | Test of ProxySQL |
45  ------------------ ------------------ ------------------ 
46 1 row in set (0.00 sec)
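The load/save pair above reflects ProxySQL's three configuration layers (memory, runtime, disk): an INSERT or UPDATE only changes the in-memory tables and must be promoted to runtime and persisted to disk explicitly. The same command pair exists for each configuration domain:

load mysql servers to runtime;     save mysql servers to disk;
load mysql users to runtime;       save mysql users to disk;
load mysql query rules to runtime; save mysql query rules to disk;
load mysql variables to runtime;   save mysql variables to disk;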


 


Configure the MySQL servers in the different hostgroups.


 1 (admin@127.0.0.1 6032)[main]>show create table mysql_servers\G
 2 *************************** 1. row ***************************
 3        table: mysql_servers
 4 Create Table: CREATE TABLE mysql_servers (
 5     hostgroup_id INT CHECK (hostgroup_id>=0) NOT NULL DEFAULT 0,
 6     hostname VARCHAR NOT NULL,
 7     port INT NOT NULL DEFAULT 3306,
 8     status VARCHAR CHECK (UPPER(status) IN ('ONLINE','SHUNNED','OFFLINE_SOFT', 'OFFLINE_HARD')) NOT NULL DEFAULT 'ONLINE',
 9     weight INT CHECK (weight >= 0) NOT NULL DEFAULT 1,
10     compression INT CHECK (compression >=0 AND compression <= 102400) NOT NULL DEFAULT 0,
11     max_connections INT CHECK (max_connections >=0) NOT NULL DEFAULT 1000,
12     max_replication_lag INT CHECK (max_replication_lag >= 0 AND max_replication_lag <= 126144000) NOT NULL DEFAULT 0,
13     use_ssl INT CHECK (use_ssl IN(0,1)) NOT NULL DEFAULT 0,
14     max_latency_ms INT UNSIGNED CHECK (max_latency_ms>=0) NOT NULL DEFAULT 0,
15     comment VARCHAR NOT NULL DEFAULT '',
16     PRIMARY KEY (hostgroup_id, hostname, port) )
17 1 row in set (0.00 sec)
18 
19 (admin@127.0.0.1 6032)[main]>insert into mysql_servers(hostgroup_id,hostname,port,max_connections,max_replication_lag,max_latency_ms) values(10,'192.168.1.101',3308,50,10,1000),(20,'192.168.1.102',3308,50,10,1000),(20,'192.168.1.103',3308,50,10,1000);
20 Query OK, 3 rows affected (0.01 sec)
21 
22 (admin@127.0.0.1 6032)[main]>select * from mysql_servers;
23  -------------- --------------- ------ -------- -------- ------------- ----------------- --------------------- --------- ---------------- --------- 
24 | hostgroup_id | hostname      | port | status | weight | compression | max_connections | max_replication_lag | use_ssl | max_latency_ms | comment |
25  -------------- --------------- ------ -------- -------- ------------- ----------------- --------------------- --------- ---------------- --------- 
26 | 10           | 192.168.1.101 | 3308 | ONLINE | 1      | 0           | 50              | 10                  | 0       | 1000           |         |
27 | 20           | 191.168.1.102 | 3308 | ONLINE | 1      | 0           | 50              | 10                  | 0       | 1000           |         |
28 | 20           | 192.168.1.103 | 3308 | ONLINE | 1      | 0           | 50              | 10                  | 0       | 1000           |         |
29  -------------- --------------- ------ -------- -------- ------------- ----------------- --------------------- --------- ---------------- --------- 
30 3 rows in set (0.00 sec)
31 
32 (admin@127.0.0.1 6032)[main]>select * from runtime_mysql_servers;
33 Empty set (0.00 sec)
34 
35 (admin@127.0.0.1 6032)[main]>select * from disk.mysql_servers;
36 Empty set (0.00 sec)
37 
38 (admin@127.0.0.1 6032)[main]>load mysql servers to runtime;save mysql servers to disk;
39 Query OK, 0 rows affected (0.01 sec)
40 
41 Query OK, 0 rows affected (0.03 sec)
42 
43 (admin@127.0.0.1 6032)[main]>select * from runtime_mysql_servers;
44  -------------- --------------- ------ --------- -------- ------------- ----------------- --------------------- --------- ---------------- --------- 
45 | hostgroup_id | hostname      | port | status  | weight | compression | max_connections | max_replication_lag | use_ssl | max_latency_ms | comment |
46  -------------- --------------- ------ --------- -------- ------------- ----------------- --------------------- --------- ---------------- --------- 
47 | 20           | 191.168.1.102 | 3308 | SHUNNED | 1      | 0           | 50              | 10                  | 0       | 1000           |         |
48 | 20           | 192.168.1.101 | 3308 | ONLINE  | 1      | 0           | 50              | 10                  | 0       | 1000           |         |
49 | 20           | 192.168.1.103 | 3308 | ONLINE  | 1      | 0           | 50              | 10                  | 0       | 1000           |         |
50  -------------- --------------- ------ --------- -------- ------------- ----------------- --------------------- --------- ---------------- --------- 
51 3 rows in set (0.00 sec)
52 
53 (admin@127.0.0.1 6032)[main]>select * from disk.mysql_servers;
54  -------------- --------------- ------ -------- -------- ------------- ----------------- --------------------- --------- ---------------- --------- 
55 | hostgroup_id | hostname      | port | status | weight | compression | max_connections | max_replication_lag | use_ssl | max_latency_ms | comment |
56  -------------- --------------- ------ -------- -------- ------------- ----------------- --------------------- --------- ---------------- --------- 
57 | 10           | 192.168.1.101 | 3308 | ONLINE | 1      | 0           | 50              | 10                  | 0       | 1000           |         |
58 | 20           | 191.168.1.102 | 3308 | ONLINE | 1      | 0           | 50              | 10                  | 0       | 1000           |         |
59 | 20           | 192.168.1.103 | 3308 | ONLINE | 1      | 0           | 50              | 10                  | 0       | 1000           |         |
60  -------------- --------------- ------ -------- -------- ------------- ----------------- --------------------- --------- ---------------- --------- 
61 3 rows in set (0.00 sec)


 


Configure the monitor user and product user.


  1 [root@zlm2 09:54:24 /var/lib/proxysql]
  2 #ls -l
  3 total 748
  4 -rw------- 1 root root 122880 Aug 10 09:39 proxysql.db
  5 -rw------- 1 root root 347160 Aug 10 09:47 proxysql.log
  6 -rw-r--r-- 1 root root      5 Aug 10 08:30 proxysql.pid
  7 -rw------- 1 root root 110592 Aug 10 09:46 proxysql_stats.db
  8 -rw-r--r-- 1 root root      2 Aug 10 08:29 reload
  9 
 10 [root@zlm2 09:54:32 /var/lib/proxysql]
 11 #tail proxysql.log
 12 2018-08-10 09:54:30 MySQL_Monitor.cpp:605:monitor_read_only_thread(): [ERROR] Timeout on read_only check for 192.168.1.101:3308 after 0ms. Unable to create a connection. If the server is overload, increase mysql-monitor_connect_timeout. Error: timeout on creating new connection: Access denied for user 'monitor'@'192.168.1.101' (using password: YES).
 13 2018-08-10 09:54:30 MySQL_Monitor.cpp:724:monitor_read_only_thread(): [ERROR] Server 192.168.1.101:3308 missed 3 read_only checks. Assuming read_only=1
 14 2018-08-10 09:54:32 MySQL_Monitor.cpp:605:monitor_read_only_thread(): [ERROR] Timeout on read_only check for 192.168.1.103:3308 after 1ms. Unable to create a connection. If the server is overload, increase mysql-monitor_connect_timeout. Error: timeout on creating new connection: Access denied for user 'monitor'@'192.168.1.101' (using password: YES).
 15 2018-08-10 09:54:32 MySQL_Monitor.cpp:724:monitor_read_only_thread(): [ERROR] Server 192.168.1.103:3308 missed 3 read_only checks. Assuming read_only=1
 16 2018-08-10 09:54:32 MySQL_Monitor.cpp:605:monitor_read_only_thread(): [ERROR] Timeout on read_only check for 192.168.1.101:3308 after 0ms. Unable to create a connection. If the server is overload, increase mysql-monitor_connect_timeout. Error: timeout on creating new connection: Access denied for user 'monitor'@'192.168.1.101' (using password: YES).
 17 2018-08-10 09:54:32 MySQL_Monitor.cpp:724:monitor_read_only_thread(): [ERROR] Server 192.168.1.101:3308 missed 3 read_only checks. Assuming read_only=1
 18 2018-08-10 09:54:33 MySQL_Monitor.cpp:605:monitor_read_only_thread(): [ERROR] Timeout on read_only check for 192.168.1.101:3308 after 0ms. Unable to create a connection. If the server is overload, increase mysql-monitor_connect_timeout. Error: timeout on creating new connection: Access denied for user 'monitor'@'192.168.1.101' (using password: YES).
 19 2018-08-10 09:54:33 MySQL_Monitor.cpp:724:monitor_read_only_thread(): [ERROR] Server 192.168.1.101:3308 missed 3 read_only checks. Assuming read_only=1
 20 2018-08-10 09:54:33 MySQL_Monitor.cpp:605:monitor_read_only_thread(): [ERROR] Timeout on read_only check for 192.168.1.103:3308 after 1ms. Unable to create a connection. If the server is overload, increase mysql-monitor_connect_timeout. Error: timeout on creating new connection: Access denied for user 'monitor'@'192.168.1.101' (using password: YES).
 21 2018-08-10 09:54:33 MySQL_Monitor.cpp:724:monitor_read_only_thread(): [ERROR] Server 192.168.1.103:3308 missed 3 read_only checks. Assuming read_only=1
 22 
 23 [root@zlm2 09:54:18 /data/mysql/mysql3308/data]
 24 #tail error.log 
 25 2018-08-10T09:54:09.757619 01:00 773 [Note] Access denied for user 'monitor'@'192.168.1.101' (using password: YES)
 26 2018-08-10T09:54:11.248093 01:00 774 [Note] Access denied for user 'monitor'@'192.168.1.101' (using password: YES)
 27 2018-08-10T09:54:12.748305 01:00 775 [Note] Access denied for user 'monitor'@'192.168.1.101' (using password: YES)
 28 2018-08-10T09:54:14.253693 01:00 776 [Note] Access denied for user 'monitor'@'192.168.1.101' (using password: YES)
 29 2018-08-10T09:54:15.753818 01:00 777 [Note] Access denied for user 'monitor'@'192.168.1.101' (using password: YES)
 30 2018-08-10T09:54:17.274359 01:00 778 [Note] Access denied for user 'monitor'@'192.168.1.101' (using password: YES)
 31 2018-08-10T09:54:18.754761 01:00 779 [Note] Access denied for user 'monitor'@'192.168.1.101' (using password: YES)
 32 2018-08-10T09:54:19.177448 01:00 780 [Note] Access denied for user 'monitor'@'192.168.1.101' (using password: YES)
 33 2018-08-10T09:54:19.386483 01:00 781 [Note] Access denied for user 'monitor'@'192.168.1.101' (using password: YES)
 34 2018-08-10T09:54:20.270737 01:00 782 [Note] Access denied for user 'monitor'@'192.168.1.101' (using password: YES)
 35 
 36 //The user "monitor" has no privileges to connect to the MySQL instance yet.
 37 
 38 //Add the monitor user "monitor" on the MySQL instance and grant privileges.
 39 (root@localhost mysql3308.sock)[(none)]>grant all privileges on *.* to monitor@'%' identified by 'monitor';
 40 Query OK, 0 rows affected, 1 warning (0.02 sec)
 41 
 42 (root@localhost mysql3308.sock)[(none)]>select user,host from mysql.user;
 43  --------------- ------------- 
 44 | user          | host        |
 45  --------------- ------------- 
 46 | monitor       | %           |
 47 | repl          | 192.168.1.% |
 48 | zlm           | 192.168.1.% |
 49 | mysql.session | localhost   |
 50 | mysql.sys     | localhost   |
 51 | root          | localhost   |
 52 | sst           | localhost   |
 53  --------------- ------------- 
 54 7 rows in set (0.00 sec)
 55 
 56 [root@zlm2 10:03:06 /var/lib/proxysql]
 57 #tail -20 proxysql.log
 58 2018-08-10 10:00:20 [INFO] New mysql_replication_hostgroups table
 59 writer_hostgroup: 10 , reader_hostgroup: 20, Test of ProxySQL
 60 2018-08-10 10:00:20 [INFO] New mysql_group_replication_hostgroups table
 61 2018-08-10 10:00:20 [INFO] Dumping current MySQL Servers structures for hostgroup ALL
 62 HID: 10 , address: 192.168.1.103 , port: 3308 , weight: 1 , status: ONLINE , max_connections: 50 , max_replication_lag: 10 , use_ssl: 0 , max_latency_ms: 1000000000 , comment: 
 63 HID: 10 , address: 192.168.1.101 , port: 3308 , weight: 1 , status: ONLINE , max_connections: 50 , max_replication_lag: 10 , use_ssl: 0 , max_latency_ms: 1000000000 , comment: 
 64 HID: 20 , address: 191.168.1.102 , port: 3308 , weight: 1 , status: ONLINE , max_connections: 50 , max_replication_lag: 10 , use_ssl: 0 , max_latency_ms: 1000000000 , comment: 
 65 HID: 20 , address: 192.168.1.103 , port: 3308 , weight: 1 , status: ONLINE , max_connections: 50 , max_replication_lag: 10 , use_ssl: 0 , max_latency_ms: 1000000000 , comment: 
 66 HID: 20 , address: 192.168.1.101 , port: 3308 , weight: 1 , status: ONLINE , max_connections: 50 , max_replication_lag: 10 , use_ssl: 0 , max_latency_ms: 1000000000 , comment: 
 67 2018-08-10 10:00:20 [INFO] Dumping mysql_servers
 68  -------------- --------------- ------ -------- -------- ------------- ----------------- --------------------- --------- ---------------- --------- ----------------- 
 69 | hostgroup_id | hostname      | port | weight | status | compression | max_connections | max_replication_lag | use_ssl | max_latency_ms | comment | mem_pointer     |
 70  -------------- --------------- ------ -------- -------- ------------- ----------------- --------------------- --------- ---------------- --------- ----------------- 
 71 | 10           | 192.168.1.103 | 3308 | 1      | 0      | 0           | 50              | 10                  | 0       | 1000           |         | 139639198273536 |
 72 | 20           | 192.168.1.101 | 3308 | 1      | 0      | 0           | 50              | 10                  | 0       | 1000           |         | 139639198867456 |
 73 | 20           | 192.168.1.103 | 3308 | 1      | 0      | 0           | 50              | 10                  | 0       | 1000           |         | 139639212485120 |
 74 | 20           | 191.168.1.102 | 3308 | 1      | 0      | 0           | 50              | 10                  | 0       | 1000           |         | 139639212484992 |
 75 | 10           | 192.168.1.101 | 3308 | 1      | 0      | 0           | 50              | 10                  | 0       | 1000           |         | 139639198412544 |
 76  -------------- --------------- ------ -------- -------- ------------- ----------------- --------------------- --------- ---------------- --------- ----------------- 
 77 2018-08-10 10:00:29 MySQL_Monitor.cpp:1437:monitor_ping(): [ERROR] Server 191.168.1.102:3308 missed 3 heartbeats, shunning it and killing all the connections. Disabling other checks until the node comes back online.
 78 
 79 //Now the ProxySQL monitor user "monitor" can collect the monitoring information.
 80 
 81 //Next, configure the product (application) user that will be used to connect to the MySQL instances.
 82 (admin@127.0.0.1 6032)[main]>show create table mysql_users\G
 83 *************************** 1. row ***************************
 84        table: mysql_users
 85 Create Table: CREATE TABLE mysql_users (
 86     username VARCHAR NOT NULL,
 87     password VARCHAR,
 88     active INT CHECK (active IN (0,1)) NOT NULL DEFAULT 1,
 89     use_ssl INT CHECK (use_ssl IN (0,1)) NOT NULL DEFAULT 0,
 90     default_hostgroup INT NOT NULL DEFAULT 0,
 91     default_schema VARCHAR,
 92     schema_locked INT CHECK (schema_locked IN (0,1)) NOT NULL DEFAULT 0,
 93     transaction_persistent INT CHECK (transaction_persistent IN (0,1)) NOT NULL DEFAULT 1,
 94     fast_forward INT CHECK (fast_forward IN (0,1)) NOT NULL DEFAULT 0,
 95     backend INT CHECK (backend IN (0,1)) NOT NULL DEFAULT 1,
 96     frontend INT CHECK (frontend IN (0,1)) NOT NULL DEFAULT 1,
 97     max_connections INT CHECK (max_connections >=0) NOT NULL DEFAULT 10000,
 98     PRIMARY KEY (username, backend),
 99     UNIQUE (username, frontend))
100 1 row in set (0.00 sec)
101 
102 (admin@127.0.0.1 6032)[main]>insert into mysql_users(username,password,active,default_hostgroup,default_schema) values('zlm','zlmzlm',1,20,'zlm');
103 Query OK, 1 row affected (0.00 sec)
104 
105 (admin@127.0.0.1 6032)[main]>select * from mysql_users;
106  ---------- ---------- -------- --------- ------------------- ---------------- --------------- ------------------------ -------------- --------- ---------- ----------------- 
107 | username | password | active | use_ssl | default_hostgroup | default_schema | schema_locked | transaction_persistent | fast_forward | backend | frontend | max_connections |
108  ---------- ---------- -------- --------- ------------------- ---------------- --------------- ------------------------ -------------- --------- ---------- ----------------- 
109 | zlm      | zlmzlm   | 1      | 0       | 20                | zlm            | 0             | 1                      | 0            | 1       | 1        | 10000           |
110  ---------- ---------- -------- --------- ------------------- ---------------- --------------- ------------------------ -------------- --------- ---------- ----------------- 
111 1 row in set (0.00 sec)
112 
113 (admin@127.0.0.1 6032)[main]>load mysql users to runtime;save mysql users to disk;
114 Query OK, 0 rows affected (0.00 sec)
115 
116 Query OK, 0 rows affected (0.01 sec)
117 
118 (admin@127.0.0.1 6032)[main]>select * from runtime_mysql_users;
119  ---------- ------------------------------------------- -------- --------- ------------------- ---------------- --------------- ------------------------ -------------- --------- ---------- ----------------- 
120 | username | password                                  | active | use_ssl | default_hostgroup | default_schema | schema_locked | transaction_persistent | fast_forward | backend | frontend | max_connections |
121  ---------- ------------------------------------------- -------- --------- ------------------- ---------------- --------------- ------------------------ -------------- --------- ---------- ----------------- 
122 | zlm      | *512FB3FAA6F522E351929CAC70AD8EBB0F5901B6 | 1      | 0       | 20                | zlm            | 0             | 1                      | 0            | 0       | 1        | 10000           |
123 | zlm      | *512FB3FAA6F522E351929CAC70AD8EBB0F5901B6 | 1      | 0       | 20                | zlm            | 0             | 1                      | 0            | 1       | 0        | 10000           |
124  ---------- ------------------------------------------- -------- --------- ------------------- ---------------- --------------- ------------------------ -------------- --------- ---------- ----------------- 
125 2 rows in set (0.00 sec)
126 
127 (admin@127.0.0.1 6032)[main]>select * from disk.mysql_users;
128  ---------- ---------- -------- --------- ------------------- ---------------- --------------- ------------------------ -------------- --------- ---------- ----------------- 
129 | username | password | active | use_ssl | default_hostgroup | default_schema | schema_locked | transaction_persistent | fast_forward | backend | frontend | max_connections |
130  ---------- ---------- -------- --------- ------------------- ---------------- --------------- ------------------------ -------------- --------- ---------- ----------------- 
131 | zlm      | zlmzlm   | 1      | 0       | 20                | zlm            | 0             | 1                      | 0            | 1       | 1        | 10000           |
132  ---------- ---------- -------- --------- ------------------- ---------------- --------------- ------------------------ -------------- --------- ---------- ----------------- 
133 1 row in set (0.00 sec)


 

Performing an incremental import of new data, after having already imported the first 100,000 rows of a table:

Configure the mysql query rules.

$ sqoop import --connect jdbc:mysql://db.foo.com/somedb --table sometable \
    --where "id > 100000" --target-dir /incremental_dataset --append

 1 (admin@127.0.0.1 6032)[main]>show create table mysql_query_rules\G
 2 *************************** 1. row ***************************
 3        table: mysql_query_rules
 4 Create Table: CREATE TABLE mysql_query_rules (
 5     rule_id INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
 6     active INT CHECK (active IN (0,1)) NOT NULL DEFAULT 0,
 7     username VARCHAR,
 8     schemaname VARCHAR,
 9     flagIN INT NOT NULL DEFAULT 0,
10     client_addr VARCHAR,
11     proxy_addr VARCHAR,
12     proxy_port INT,
13     digest VARCHAR,
14     match_digest VARCHAR,
15     match_pattern VARCHAR,
16     negate_match_pattern INT CHECK (negate_match_pattern IN (0,1)) NOT NULL DEFAULT 0,
17     re_modifiers VARCHAR DEFAULT 'CASELESS',
18     flagOUT INT,
19     replace_pattern VARCHAR,
20     destination_hostgroup INT DEFAULT NULL,
21     cache_ttl INT CHECK(cache_ttl > 0),
22     reconnect INT CHECK (reconnect IN (0,1)) DEFAULT NULL,
23     timeout INT UNSIGNED,
24     retries INT CHECK (retries>=0 AND retries <=1000),
25     delay INT UNSIGNED,
26     next_query_flagIN INT UNSIGNED,
27     mirror_flagOUT INT UNSIGNED,
28     mirror_hostgroup INT UNSIGNED,
29     error_msg VARCHAR,
30     OK_msg VARCHAR,
31     sticky_conn INT CHECK (sticky_conn IN (0,1)),
32     multiplex INT CHECK (multiplex IN (0,1,2)),
33     log INT CHECK (log IN (0,1)),
34     apply INT CHECK(apply IN (0,1)) NOT NULL DEFAULT 0,
35     comment VARCHAR)
36 1 row in set (0.00 sec)
37 
38 (admin@127.0.0.1 6032)[main]>insert into mysql_query_rules(active,username,match_pattern,schemaname,destination_hostgroup,apply) values(1,'zlm','^select','zlm',20,1);
39 Query OK, 1 row affected (0.00 sec)
40 
41 (admin@127.0.0.1 6032)[main]>select active,username,match_pattern,schemaname,destination_hostgroup,apply from mysql_query_rules;
42  -------- ---------- --------------- ------------ ----------------------- ------- 
43 | active | username | match_pattern | schemaname | destination_hostgroup | apply |
44  -------- ---------- --------------- ------------ ----------------------- ------- 
45 | 1      | zlm      | ^select       | zlm        | 20                    | 1     |
46  -------- ---------- --------------- ------------ ----------------------- ------- 
47 1 row in set (0.00 sec)
48 
49 (admin@127.0.0.1 6032)[main]>load mysql query rules to runtime;save mysql query rules to disk;
50 Query OK, 0 rows affected (0.00 sec)
51 
52 Query OK, 0 rows affected (0.02 sec)
53 
54 (admin@127.0.0.1 6032)[main]>select active,username,match_pattern,schemaname,destination_hostgroup,apply from runtime_mysql_query_rules;
55  -------- ---------- --------------- ------------ ----------------------- ------- 
56 | active | username | match_pattern | schemaname | destination_hostgroup | apply |
57  -------- ---------- --------------- ------------ ----------------------- ------- 
58 | 1      | zlm      | ^select       | zlm        | 20                    | 1     |
59  -------- ---------- --------------- ------------ ----------------------- ------- 
60 1 row in set (0.00 sec)
61 
62 (admin@127.0.0.1 6032)[main]>select active,username,match_pattern,schemaname,destination_hostgroup,apply from disk.mysql_query_rules;
63  -------- ---------- --------------- ------------ ----------------------- ------- 
64 | active | username | match_pattern | schemaname | destination_hostgroup | apply |
65  -------- ---------- --------------- ------------ ----------------------- ------- 
66 | 1      | zlm      | ^select       | zlm        | 20                    | 1     |
67  -------- ---------- --------------- ------------ ----------------------- ------- 
68 1 row in set (0.00 sec)
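Note that rules are evaluated in ascending rule_id order and evaluation stops at the first match with apply=1. A classic read/write split (a sketch, not part of the original setup) would therefore use two rules, keeping SELECT ... FOR UPDATE on the writer hostgroup:

(admin@127.0.0.1 6032)[main]>insert into mysql_query_rules(rule_id,active,match_digest,destination_hostgroup,apply) values (1,1,'^SELECT.*FOR UPDATE$',10,1),(2,1,'^SELECT',20,1);
(admin@127.0.0.1 6032)[main]>load mysql query rules to runtime;save mysql query rules to disk;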

--where "id > 100000" --target-dir /incremental_dataset --append

 

An import of a table named EMPLOYEES in the corp database that uses validation to validate the import using the table row count and number of rows copied into HDFS:

3. Test DML operations.

$ sqoop import --connect jdbc:mysql://db.foo.com/corp \
    --table EMPLOYEES --validate

 


Log in to the MySQL instance with the product user 'zlm'.

The import-all-tables tool imports a set of tables from an RDBMS to HDFS. Data from each table is stored in a separate directory in HDFS.

  1 //Connect with port 6033 to use the query rule of ProxySQL.
  2 [root@zlm2 11:13:16 /data/mysql/mysql3308/data]
  3 #mysql -uzlm -pzlmzlm -h192.168.1.101 -P6033
  4 mysql: [Warning] Using a password on the command line interface can be insecure.
  5 Welcome to the MySQL monitor.  Commands end with ; or \g.
  6 Your MySQL connection id is 4
  7 Server version: 5.5.30 (ProxySQL)
  8 
  9 Copyright (c) 2009-2018 Percona LLC and/or its affiliates
 10 Copyright (c) 2000, 2018, Oracle and/or its affiliates. All rights reserved.
 11 
 12 Oracle is a registered trademark of Oracle Corporation and/or its
 13 affiliates. Other names may be trademarks of their respective
 14 owners.
 15 
 16 Type 'help;' or 'h' for help. Type 'c' to clear the current input statement.
 17 
 18 (zlm@192.168.1.101 6033)[(none)]>show tables; //The default database automatically became zlm although I haven't specified which database to use.
 19  --------------- 
 20 | Tables_in_zlm |
 21  --------------- 
 22 | t1            |
 23 | t2            |
 24  --------------- 
 25 2 rows in set (0.00 sec)
 26 
 27 (zlm@192.168.1.101 6033)[(none)]>select * from t1;
 28  ---- ------------ 
 29 | id | name       |
 30  ---- ------------ 
 31 |  1 | MySQL      |
 32 |  2 | Oracle     |
 33 |  3 | PostgreSQL |
 34 |  4 | Redis      |
 35  ---- ------------ 
 36 4 rows in set (0.00 sec)
 37 
 38 (zlm@192.168.1.101 6033)[(none)]>insert into t1 values(5,'MongoDB');
 39 Query OK, 1 row affected (0.01 sec)
 40 
 41 (zlm@192.168.1.101 6033)[(none)]>select * from t1;
 42  ---- ------------ 
 43 | id | name       |
 44  ---- ------------ 
 45 |  1 | MySQL      |
 46 |  2 | Oracle     |
 47 |  3 | PostgreSQL |
 48 |  4 | Redis      |
 49 |  5 | MongoDB    |
 50  ---- ------------ 
 51 5 rows in set (0.00 sec)
 52 
 53 //Check the stats we've captured in the "stats_mysql_query_digest" table of ProxySQL's "stats" database.
 54 (admin@127.0.0.1 6032)[main]>use stats;
 55 Reading table information for completion of table and column names
 56 You can turn off this feature to get a quicker startup with -A
 57 
 58 Database changed
 59 (admin@127.0.0.1 6032)[stats]>show tables; //Notice: if you don't specify "from xxx", it shows the tables of the "main" database.
 60  -------------------------------------------- 
 61 | tables                                     |
 62  -------------------------------------------- 
 63 | global_variables                           |
 64 | mysql_collations                           |
 65 | mysql_group_replication_hostgroups         |
 66 | mysql_query_rules                          |
 67 | mysql_query_rules_fast_routing             |
 68 | mysql_replication_hostgroups               |
 69 | mysql_servers                              |
 70 | mysql_users                                |
 71 | proxysql_servers                           |
 72 | runtime_checksums_values                   |
 73 | runtime_global_variables                   |
 74 | runtime_mysql_group_replication_hostgroups |
 75 | runtime_mysql_query_rules                  |
 76 | runtime_mysql_query_rules_fast_routing     |
 77 | runtime_mysql_replication_hostgroups       |
 78 | runtime_mysql_servers                      |
 79 | runtime_mysql_users                        |
 80 | runtime_proxysql_servers                   |
 81 | runtime_scheduler                          |
 82 | scheduler                                  |
 83  -------------------------------------------- 
 84 20 rows in set (0.00 sec)
 85 
 86 (admin@127.0.0.1 6032)[stats]>show tables from stats; //Specify the target database "stats" explicitly.
 87  -------------------------------------- 
 88 | tables                               |
 89  -------------------------------------- 
 90 | global_variables                     |
 91 | stats_memory_metrics                 |
 92 | stats_mysql_commands_counters        |
 93 | stats_mysql_connection_pool          |
 94 | stats_mysql_connection_pool_reset    |
 95 | stats_mysql_global                   |
 96 | stats_mysql_prepared_statements_info |
 97 | stats_mysql_processlist              |
 98 | stats_mysql_query_digest             |
 99 | stats_mysql_query_digest_reset       |
100 | stats_mysql_query_rules              |
101 | stats_mysql_users                    |
102 | stats_proxysql_servers_checksums     |
103 | stats_proxysql_servers_metrics       |
104 | stats_proxysql_servers_status        |
105  -------------------------------------- 
106 15 rows in set (0.00 sec)
107 
108 (admin@127.0.0.1 6032)[stats]>select hostgroup,schemaname,username,substr(digest_text,120,-120),count_star from stats_mysql_query_digest;
109  ----------- ------------ ---------- ---------------------------------- ------------ 
110 | hostgroup | schemaname | username | substr(digest_text,120,-120)     | count_star |
111  ----------- ------------ ---------- ---------------------------------- ------------ 
112 | 20        | zlm        | zlm      | select * from t1                 | 2          |
113 | 20        | zlm        | zlm      | insert into t1 values(?,?)       | 1          |
114 | 20        | zlm        | zlm      | select USER()                    | 2          |
115 | 20        | zlm        | zlm      | show tables                      | 1          |
116 | 20        | zlm        | zlm      | select @@version_comment limit ? | 2          |
117  ----------- ------------ ---------- ---------------------------------- ------------ 
118 5 rows in set (0.00 sec)
119 
120 //That's because I set "default_hostgroup" to "20". Actually it should be "10", the writer hostgroup.
121 
122 (admin@127.0.0.1 6032)[stats]>select * from mysql_users;
123  ---------- ---------- -------- --------- ------------------- ---------------- --------------- ------------------------ -------------- --------- ---------- ----------------- 
124 | username | password | active | use_ssl | default_hostgroup | default_schema | schema_locked | transaction_persistent | fast_forward | backend | frontend | max_connections |
125  ---------- ---------- -------- --------- ------------------- ---------------- --------------- ------------------------ -------------- --------- ---------- ----------------- 
126 | zlm      | zlmzlm   | 1      | 0       | 20                | zlm            | 0             | 1                      | 0            | 1       | 1        | 10000           |
127  ---------- ---------- -------- --------- ------------------- ---------------- --------------- ------------------------ -------------- --------- ---------- ----------------- 
128 1 row in set (0.00 sec)
129 
130 (admin@127.0.0.1 6032)[stats]>update mysql_users set default_hostgroup=10;
131 Query OK, 1 row affected (0.00 sec)
132 
133 (admin@127.0.0.1 6032)[stats]>select * from mysql_users;
134  ---------- ---------- -------- --------- ------------------- ---------------- --------------- ------------------------ -------------- --------- ---------- ----------------- 
135 | username | password | active | use_ssl | default_hostgroup | default_schema | schema_locked | transaction_persistent | fast_forward | backend | frontend | max_connections |
136  ---------- ---------- -------- --------- ------------------- ---------------- --------------- ------------------------ -------------- --------- ---------- ----------------- 
137 | zlm      | zlmzlm   | 1      | 0       | 10                | zlm            | 0             | 1                      | 0            | 1       | 1        | 10000           |
138  ---------- ---------- -------- --------- ------------------- ---------------- --------------- ------------------------ -------------- --------- ---------- ----------------- 
139 1 row in set (0.00 sec)
140 
141 (admin@127.0.0.1 6032)[stats]>load mysql users to runtime;save mysql users to disk;
142 Query OK, 0 rows affected (0.00 sec)
143 
144 Query OK, 0 rows affected (0.05 sec)
145 
146 (admin@127.0.0.1 6032)[stats]>select  * from runtime_mysql_users;
147  ---------- ------------------------------------------- -------- --------- ------------------- ---------------- --------------- ------------------------ -------------- --------- ---------- ----------------- 
148 | username | password                                  | active | use_ssl | default_hostgroup | default_schema | schema_locked | transaction_persistent | fast_forward | backend | frontend | max_connections |
149  ---------- ------------------------------------------- -------- --------- ------------------- ---------------- --------------- ------------------------ -------------- --------- ---------- ----------------- 
150 | zlm      | *512FB3FAA6F522E351929CAC70AD8EBB0F5901B6 | 1      | 0       | 10                | zlm            | 0             | 1                      | 0            | 0       | 1        | 10000           |
151 | zlm      | *512FB3FAA6F522E351929CAC70AD8EBB0F5901B6 | 1      | 0       | 10                | zlm            | 0             | 1                      | 0            | 1       | 0        | 10000           |
152  ---------- ------------------------------------------- -------- --------- ------------------- ---------------- --------------- ------------------------ -------------- --------- ---------- ----------------- 
153 2 rows in set (0.00 sec)
154 
155 (admin@127.0.0.1 6032)[stats]>select  * from disk.mysql_users;
156  ---------- ---------- -------- --------- ------------------- ---------------- --------------- ------------------------ -------------- --------- ---------- ----------------- 
157 | username | password | active | use_ssl | default_hostgroup | default_schema | schema_locked | transaction_persistent | fast_forward | backend | frontend | max_connections |
158  ---------- ---------- -------- --------- ------------------- ---------------- --------------- ------------------------ -------------- --------- ---------- ----------------- 
159 | zlm      | zlmzlm   | 1      | 0       | 10                | zlm            | 0             | 1                      | 0            | 1       | 1        | 10000           |
160  ---------- ---------- -------- --------- ------------------- ---------------- --------------- ------------------------ -------------- --------- ---------- ----------------- 
161 1 row in set (0.00 sec)
162 
163 //Clear the statistics of table "stats_mysql_query_digest"; selecting from the "_reset" variant returns the current rows and resets them.
164 (admin@127.0.0.1 6032)[stats]>select * from stats_mysql_query_digest_reset;
165  ----------- ------------ ---------- -------------------- ---------------------------------- ------------ ------------ ------------ ---------- ---------- ---------- 
166 | hostgroup | schemaname | username | digest             | digest_text                      | count_star | first_seen | last_seen  | sum_time | min_time | max_time |
167  ----------- ------------ ---------- -------------------- ---------------------------------- ------------ ------------ ------------ ---------- ---------- ---------- 
168 | 20        | zlm        | zlm      | 0x3765930C7143F468 | select * from t1                 | 2          | 1533892415 | 1533892545 | 828      | 204      | 624      |
169 | 20        | zlm        | zlm      | 0x3CE4C46484576DFD | insert into t1 values(?,?)       | 1          | 1533892538 | 1533892538 | 13359    | 13359    | 13359    |
170 | 20        | zlm        | zlm      | 0x594F2C744B698066 | select USER()                    | 2          | 1533892377 | 1533892397 | 0        | 0        | 0        |
171 | 20        | zlm        | zlm      | 0x99531AEFF718C501 | show tables                      | 1          | 1533892400 | 1533892400 | 2607     | 2607     | 2607     |
172 | 20        | zlm        | zlm      | 0x226CD90D52A2BA0B | select @@version_comment limit ? | 2          | 1533892377 | 1533892397 | 0        | 0        | 0        |
173  ----------- ------------ ---------- -------------------- ---------------------------------- ------------ ------------ ------------ ---------- ---------- ---------- 
174 5 rows in set (0.01 sec)
175 
176 (admin@127.0.0.1 6032)[stats]>select * from stats_mysql_query_digest;
177 Empty set (0.00 sec)
178 
179 //Do the dml operations again.
180 (zlm@192.168.1.101 6033)[(none)]>show tables;
181  --------------- 
182 | Tables_in_zlm |
183  --------------- 
184 | t1            |
185 | t2            |
186  --------------- 
187 2 rows in set (0.00 sec)
188 
189 (zlm@192.168.1.101 6033)[(none)]>select * from t2;
190  ---- ------------ 
191 | id | name       |
192  ---- ------------ 
193 |  1 | MySQL      |
194 |  2 | Oracle     |
195 |  3 | SQL Server |
196 |  4 | Redis      |
197 |  5 | MongoDB    |
198 |  6 | PostgreSQL |
199  ---- ------------ 
200 6 rows in set (0.01 sec)
201 
202 (zlm@192.168.1.101 6033)[(none)]>insert into t2 values(7,'Hadoop');
203 Query OK, 1 row affected (0.01 sec)
204 
205 //Check the digest data in table "stats_mysql_query_digest" again.
206 (admin@127.0.0.1 6032)[stats]>select active,username,match_pattern,schemaname,destination_hostgroup,apply from mysql_query_rules;
207  -------- ---------- --------------- ------------ ----------------------- ------- 
208 | active | username | match_pattern | schemaname | destination_hostgroup | apply |
209  -------- ---------- --------------- ------------ ----------------------- ------- 
210 | 1      | zlm      | ^select       | zlm        | 20                    | 1     |
211  -------- ---------- --------------- ------------ ----------------------- ------- 
212 1 row in set (0.00 sec)
213 
214 (admin@127.0.0.1 6032)[stats]>select hostgroup,schemaname,username,substr(digest_text,120,-120),count_star from stats_mysql_query_digest;
215  ----------- ------------ ---------- ------------------------------ ------------ 
216 | hostgroup | schemaname | username | substr(digest_text,120,-120) | count_star |
217  ----------- ------------ ---------- ------------------------------ ------------ 
218 | 20        | zlm        | zlm      | insert into t2 values(?,?)   | 1          |
219 | 20        | zlm        | zlm      | select * from t2             | 1          |
220 | 20        | zlm        | zlm      | show tables                  | 1          |
221  ----------- ------------ ---------- ------------------------------ ------------ 
222 3 rows in set (0.00 sec)
223 
224 //It still doesn't take effect. What's wrong? (Presumably this session was opened before the user change was loaded to runtime; default_hostgroup is applied when the client connects, so a reconnect would be needed.)

For the import-all-tables tool to be useful, the following conditions must be met:

 

Each table must have a single-column primary key or --autoreset-to-one-mapper option must be used.

You must intend to import all columns of each table.

You must not intend to use non-default splitting column, nor impose any conditions via a WHERE clause.

These arguments behave in the same manner as they do when used for the sqoop-import tool, but the --table, --split-by, --columns, and --where arguments are invalid for sqoop-import-all-tables. The --exclude-tables argument is for sqoop-import-all-tables only.

8.3. Example Invocations

Import all tables from the corp database:

$ sqoop import-all-tables --connect jdbc:mysql://db.foo.com/corp

Verifying that it worked:

$ hadoop fs -ls

Found 4 items

drwxr-xr-x  - someuser somegrp      0 2010-04-27 17:15 /user/someuser/EMPLOYEES

drwxr-xr-x  - someuser somegrp      0 2010-04-27 17:15 /user/someuser/PAYCHECKS

drwxr-xr-x  - someuser somegrp      0 2010-04-27 17:15 /user/someuser/DEPARTMENTS

drwxr-xr-x  - someuser somegrp      0 2010-04-27 17:15 /user/someuser/OFFICE_SUPPLIES

The export tool exports a set of files from HDFS back to an RDBMS. The target table must already exist in the database. The input files are read and parsed into a set of records according to the user-specified delimiters.

The default operation is to transform these into a set of INSERT statements that inject the records into the database. In "update mode," Sqoop will generate UPDATE statements that replace existing records in the database, and in "call mode" Sqoop will make a stored procedure call for each record.

Although the Hadoop generic arguments must precede any export arguments, the export arguments can be entered in any order with respect to one another.

Sqoop supports additional import targets beyond HDFS and Hive. Sqoop can also import records into a table in HBase.

By specifying --hbase-table, you instruct Sqoop to import to a table in HBase rather than a directory in HDFS. Sqoop will import data to the table specified as the argument to --hbase-table. Each row of the input table will be transformed into an HBase Put operation to a row of the output table. The key for each row is taken from a column of the input. By default Sqoop will use the split-by column as the row key column. If that is not specified, it will try to identify the primary key column, if any, of the source table. You can manually specify the row key column with --hbase-row-key. Each output column will be placed in the same column family, which must be specified with --column-family.
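A hypothetical HBase-targeted import using these options might look like this (the HBase table, column family, and row key column are illustrative):

$ sqoop import --connect jdbc:mysql://db.foo.com/corp --table EMPLOYEES \
    --hbase-table employees --column-family info --hbase-row-key id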

Note: This function is incompatible with direct import (parameter --direct).

The --export-dir argument and one of --table or --call are required. These specify the table to populate in the database (or the stored procedure to call), and the directory in HDFS that contains the source data.

By default, all columns within a table are selected for export. You can select a subset of columns and control their ordering by using the --columns argument. This should include a comma-delimited list of columns to export. For example: --columns "col1,col2,col3". Note that columns that are not included in the --columns parameter need to have either defined default value or allow NULL values. Otherwise your database will reject the imported data which in turn will make Sqoop job fail. (In short: columns excluded from --columns must have a default value or allow NULLs in the target table, or the export will fail.)

By default, Sqoop will use four tasks in parallel for the export process. This may not be optimal; you will need to experiment with your own particular setup. Additional tasks may offer better concurrency, but if the database is already bottlenecked on updating indices, invoking triggers, and so on, then additional load may decrease performance. The --num-mappers or -m arguments control the number of map tasks, which is the degree of parallelism used. (You can increase the number of map tasks, but once the database hits a performance bottleneck, adding more maps will actually hurt performance.)

If --input-null-string is not specified, then the string "null" will be interpreted as null for string-type columns. If --input-null-non-string is not specified, then both the string "null" and the empty string will be interpreted as null for non-string columns. Note that the empty string will always be interpreted as null for non-string columns, in addition to any other string specified by --input-null-non-string. (When exporting, a NULL value destined for a NOT NULL column will cause the task to fail.)
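For example, if the HDFS files use Hive-style \N null markers, the export can state that explicitly (a sketch; the connection string, table, and paths are illustrative):

$ sqoop export --connect jdbc:mysql://db.example.com/foo --table bar \
    --export-dir /results/bar_data \
    --input-null-string '\\N' --input-null-non-string '\\N'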

Since Sqoop breaks down export process into multiple transactions, it is possible that a failed export job may result in partial data being committed to the database. This can further lead to subsequent jobs failing due to insert collisions in some cases, or lead to duplicated data in others. You can overcome this problem by specifying a staging table via the --staging-table option which acts as an auxiliary table that is used to stage exported data. The staged data is finally moved to the destination table in a single transaction. (Because the export is split into multiple transactions, a failed job can leave partial data in the table; with --staging-table the rows are first staged in an auxiliary table and then moved to the target table in one final transaction.)

Support for staging data prior to pushing it into the destination table is not always available for --direct exports. It is also not available when export is invoked using the --update-key option for updating existing data, and when stored procedures are used to insert the data.
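A hypothetical staged export could then look like this (bar_stage is an assumed staging table with the same structure as bar; --clear-staging-table empties it before loading):

$ sqoop export --connect jdbc:mysql://db.example.com/foo --table bar \
    --export-dir /results/bar_data --staging-table bar_stage --clear-staging-table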


By default, sqoop-export appends new rows to a table; each input record is transformed into an INSERT statement that adds a row to the target database table. If your table has constraints (e.g., a primary key column whose values must be unique) and already contains data, you must take care to avoid inserting records that violate these constraints. The export process will fail if an INSERT statement fails. This mode is primarily intended for exporting records to a new, empty table intended to receive these results.

If you specify the --update-key argument, Sqoop will instead modify an existing dataset in the database. Each input record is treated as an UPDATE statement that modifies an existing row. The row a statement modifies is determined by the column name(s) specified with --update-key. For example, consider the following table definition:

Depending on the target database, you may also specify the --update-mode argument with allowinsert mode if you want to update rows if they exist in the database already or insert rows if they do not exist yet.

If an UPDATE statement modifies no rows, this is not considered an error; the export will silently continue. (In effect, this means that an update-based export will not insert new rows into the database.) Likewise, if the column specified with --update-key does not uniquely identify rows and multiple rows are updated by a single statement, this condition is also undetected.

The argument --update-key can also be given a comma separated list of column names. In which case, Sqoop will match all keys from this list before updating any existing record.
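Putting the update options together, a hypothetical update-mode export keyed on an id column might be:

$ sqoop export --connect jdbc:mysql://db.example.com/foo --table bar \
    --export-dir /results/bar_data --update-key id --update-mode allowinsert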

Exports are performed by multiple writers in parallel. Each writer uses a separate connection to the database; these have separate transactions from one another. Sqoop uses the multi-row INSERT syntax to insert up to 100 records per statement. Every 100 statements, the current transaction within a writer task is committed, causing a commit every 10,000 rows. This ensures that transaction buffers do not grow without bound, and cause out-of-memory conditions. Therefore, an export is not an atomic process. Partial results from the export will become visible before the export is complete. (In short: each writer commits roughly every 10,000 rows so transaction buffers stay bounded; an export is therefore not an atomic process, and partial results may appear in the table before the job finishes.)

Exports may fail for a number of reasons:

Loss of connectivity from the Hadoop cluster to the database (either due to hardware fault, or server software crashes)

Attempting to INSERT a row which violates a consistency constraint (for example, inserting a duplicate primary key value)

Attempting to parse an incomplete or malformed record from the HDFS source data

Attempting to parse records using incorrect delimiters

Capacity issues (such as insufficient RAM or disk space)

If an export map task fails due to these or other reasons, it will cause the export job to fail. The results of a failed export are undefined. Each export map task operates in a separate transaction. Furthermore, individual map tasks commit their current transaction periodically. If a task fails, the current transaction will be rolled back. Any previously-committed transactions will remain durable in the database, leading to a partially-complete export.

A basic export to populate a table named bar:

$ sqoop export --connect jdbc:mysql://db.example.com/foo --table bar \
    --export-dir /results/bar_data

This example takes the files in /results/bar_data and injects their contents into the bar table in the foo database on db.example.com. The target table must already exist in the database. Sqoop performs a set of INSERT INTO operations, without regard for existing content. If Sqoop attempts to insert rows which violate constraints in the database (for example, a particular primary key value already exists), then the export fails.

Alternatively, you can specify the columns to be exported by providing --columns "col1,col2,col3". Please note that columns not included in the --columns parameter need to have either a defined default value or allow NULL values; otherwise your database will reject the exported data, which in turn will make the Sqoop job fail.
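
For instance, to export only a subset of columns (the column names are illustrative):

$ sqoop export --connect jdbc:mysql://db.example.com/foo --table bar \
    --export-dir /results/bar_data \
    --columns "id,msg"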

Another basic export to populate a table named bar with validation enabled:

$ sqoop export --connect jdbc:mysql://db.example.com/foo --table bar \
    --export-dir /results/bar_data --validate

An export that calls a stored procedure named barproc for every record in /results/bar_data would look like:

$ sqoop export --connect jdbc:mysql://db.example.com/foo --call barproc \
    --export-dir /results/bar_data

Validation

Validation compares the row counts of the source and of the target after the copy completes, for either an import or an export.

Imports and exports can be repeatedly performed by issuing the same command multiple times. Especially when using the incremental import capability, this is an expected scenario.

Sqoop allows you to define saved jobs which make this process easier. A saved job records the configuration information required to execute a Sqoop command at a later time. The section on the sqoop-job tool describes how to create and work with saved jobs.

By default, job descriptions are saved to a private repository stored in $HOME/.sqoop/. You can configure Sqoop to instead use a shared metastore, which makes saved jobs available to multiple users across a shared cluster. Starting the metastore is covered by the section on the sqoop-metastore tool.

Use sqoop job to define imports and exports that are executed repeatedly.

The job tool allows you to create and work with saved jobs. Saved jobs remember the parameters used to specify a job, so they can be re-executed by invoking the job by its handle.

If a saved job is configured to perform an incremental import, state regarding the most recently imported rows is updated in the saved job to allow the job to continually import only the newest rows.

Saved jobs can be re-executed; when a job performs an incremental import, its saved state lets each run pick up only the rows added since the previous run.
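
A sketch of such a saved incremental job (the job name and the starting --last-value are assumptions; after each --exec run the stored last-value is updated automatically):

$ sqoop job --create incimport -- import --connect jdbc:mysql://example.com/db \
    --table mytable \
    --incremental append --check-column id --last-value 0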

Creating saved jobs is done with the --create action. This operation requires a -- followed by a tool name and its arguments. The tool and its arguments will form the basis of the saved job. Consider:

$ sqoop job --create myjob -- import --connect jdbc:mysql://example.com/db \
    --table mytable

This creates a job named myjob which can be executed later. The job is not run. This job is now available in the list of saved jobs:

$ sqoop job --list

Available jobs:

myjob

We can inspect the configuration of a job with the show action:

$ sqoop job --show myjob

Job: myjob

Tool: import

Options:


direct.import = false

codegen.input.delimiters.record = 0

hdfs.append.dir = false

db.table = mytable

...

And if we are satisfied with it, we can run the job with exec:

$ sqoop job --exec myjob

10/08/19 13:08:45 INFO tool.CodeGenTool: Beginning code generation

...

The exec action allows you to override arguments of the saved job by supplying them after a --. For example, if the database were changed to require a username, we could specify the username and password with:

$ sqoop job --exec myjob -- --username someuser -P

Enter password:

...

Incremental imports are performed by comparing the values in a check column against a reference value for the most recent import. For example, if the --incremental append argument was specified, along with --check-column id and --last-value 100, all rows with id > 100 will be imported. If an incremental import is run from the command line, the value which should be specified as --last-value in a subsequent incremental import will be printed to the screen for your reference. If an incremental import is run from a saved job, this value will be retained in the saved job. Subsequent runs of sqoop job --exec someIncrementalJob will continue to import only newer rows than those previously imported.

Incremental import, in short: --incremental append --check-column id --last-value 100.
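
Put together as a one-off command line (connection details are illustrative):

$ sqoop import --connect jdbc:mysql://example.com/db --table mytable \
    --incremental append --check-column id --last-value 100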

The metastore tool configures Sqoop to host a shared metadata repository. Multiple users and/or remote users can define and execute saved jobs (created with sqoop job) defined in this metastore.

Clients must be configured to connect to the metastore in sqoop-site.xml or with the --meta-connect argument.

The metastore is available over TCP/IP. The port is controlled by the sqoop.metastore.server.port configuration parameter, and defaults to 16000.
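
A client pointing at such a metastore could pass --meta-connect explicitly (the host name is an assumption; the URL follows the jdbc:hsqldb:hsql://host:port/sqoop convention):

$ sqoop job --list --meta-connect jdbc:hsqldb:hsql://metaserver.example.com:16000/sqoop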

The merge tool runs a MapReduce job that takes two directories as input: a newer dataset and an older one. These are specified with --new-data and --onto respectively. The output of the MapReduce job will be placed in the HDFS directory specified by --target-dir.

When merging the datasets, it is assumed that there is a unique primary key value in each record. The column for the primary key is specified with --merge-key. Multiple rows in the same dataset should not share the same primary key, or data loss may occur.
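
A sketch of a merge invocation (directory names are assumptions; --jar-file and --class-name point at the record class produced by codegen for the table):

$ sqoop merge --new-data newer --onto older --target-dir merged \
    --jar-file datatypes.jar --class-name Foo --merge-key id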

The codegen tool generates Java classes which encapsulate and interpret imported records. The Java definition of a record is instantiated as part of the import process, but can also be performed separately. For example, if the Java source is lost, it can be recreated. New versions of a class can be created which use different delimiters between fields, and so on.

Recreate the record interpretation code for the employees table of a corporate database:

$ sqoop codegen --connect jdbc:mysql://db.example.com/corp \
    --table employees

The create-hive-table tool populates a Hive metastore with a definition for a table based on a database table previously imported to HDFS, or one planned to be imported. This effectively performs the "--hive-import" step of sqoop-import without running the preceding import.

If data was already loaded to HDFS, you can use this tool to finish the pipeline of importing the data to Hive. You can also create Hive tables with this tool; data then can be imported and populated into the target after a preprocessing step run by the user.

Define in Hive a table named emps with a definition based on a database table named employees:

$ sqoop create-hive-table --connect jdbc:mysql://db.example.com/corp \
    --table employees --hive-table emps

The eval tool allows users to quickly run simple SQL queries against a database; results are printed to the console. This allows users to preview their import queries to ensure they import the data they expect.

The eval tool is provided for evaluation purposes only: you can use it to verify a database connection from within Sqoop or to test simple queries. It is not meant to be used in production workflows.

Select ten records from the employees table:

$ sqoop eval --connect jdbc:mysql://db.example.com/corp \
    --query "SELECT * FROM employees LIMIT 10"

Insert a row into the foo table:

$ sqoop eval --connect jdbc:mysql://db.example.com/corp \
    -e "INSERT INTO foo VALUES(42, 'bar')"

List database schemas available on a MySQL server:

$ sqoop list-databases --connect jdbc:mysql://database.example.com/

information_schema

employees

This only works with HSQLDB, MySQL and Oracle. When used with Oracle, the user connecting to the database must have DBA privileges.

Netezza connector supports an optimized data transfer facility using the Netezza external tables feature. Each map task of the Netezza connector's import job works on a subset of the Netezza partitions and transparently creates and uses an external table to transport data. Similarly, export jobs use the external table to push data fast onto the NZ system. Direct mode does not support staging tables, upsert options, etc.

Here is an example of a complete command line for an import using the Netezza external table feature.

$ sqoop import \
    --direct \
    --connect jdbc:netezza://nzhost:5480/sqoop \
    --table nztable \
    --username nzuser \
    --password nzpass \
    --target-dir hdfsdir

Here is an example of a complete command line for an export with tab as the field terminator character.

$ sqoop export \
    --direct \
    --connect jdbc:netezza://nzhost:5480/sqoop \
    --table nztable \
    --username nzuser \
    --password nzpass \
    --export-dir hdfsdir \
    --input-fields-terminated-by "\t"

Netezza direct connector supports the null-string features of Sqoop. The null string values are converted to appropriate external table options during export and import operations.
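
For example, a direct-mode export that treats \N as NULL (credentials and paths as in the examples above; both null-string arguments are set to the same value here, as direct mode expects):

$ sqoop export --direct \
    --connect jdbc:netezza://nzhost:5480/sqoop \
    --table nztable --username nzuser --password nzpass \
    --export-dir hdfsdir \
    --input-null-string '\\N' --input-null-non-string '\\N'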
