expdp user/password directory=datapump dumpfile=oracle_example.dmp content=metadata_only logfile=example.logimpdp user/password directory=datapump dumpfile=oracle_example.dmp sqlfile=example_sql.sql TRANSFORM=SEGMENT_ATTRIBUTES:N:table PARTITION_OPTIONS=MERGE您需要将每个表的数据提取到数据列表文件(.lst)中。我们编写了一个简单的SQL脚本(spool.sql)来执行此操作:
cat spool.sqlSET ECHO OFF
SET TERMOUT OFF
SET FEEDBACK OFF
SET PAGESIZE 0
SET TRIMSPOOL ON
SET WRAP OFF
set linesize 30000
SET RECSEP OFF
SET VERIFY OFF
SET ARRAYSIZE 10000
SET COLSEP '|'
SPOOL '&1'
ALTER SESSION SET nls_date_format = 'YYYY-MM-DD HH24:MI:SS';
SELECT * from &1;
SPOOL OFF
SET PAGESIZE 24
SET FEEDBACK ON
SET TERMOUT ON要提取数据,我们为SQL * Plus中的每个表运行了脚本:
$ sqlplus user/password@spool CUSTOMERS
@spool ADDRESSES
@spool CARD_DETAILS
@spool WAREHOUSES
@spool ORDER_ITEMS
@spool ORDERS
@spool INVENTORIES
@spool PRODUCT_INFORMATION
@spool LOGON
@spool PRODUCT_DESCRIPTIONS
@spool ORDERENTRY_METADATA.lst为每个表创建一个带有前导和尾随空格的数据列表文件()。
退出SQL * Plus:
>EXIT每个表的数据列表文件都需要转换为CSV并针对hubble进行格式化。我们编写了一个简单的Python脚本(fix-example.py)来执行此操作:
cat fix-example.pyimport csv
import string
import sys
for lstfile in sys.argv[1:]:
filename = lstfile.split(".")[0]
with open(sys.argv[1]) as f:
reader = csv.reader(f, delimiter="|")
with open(filename+".csv", "w") as fo:
writer = csv.writer(fo)
for rec in reader:
writer.writerow(map(string.strip, rec))python fix-example.py CUSTOMERS.lst ADDRESSES.lst CARD_DETAILS.lst WAREHOUSES.lst ORDER_ITEMS.lst ORDERS.lst INVENTORIES.lst PRODUCT_INFORMATION.lst LOGON.lst PRODUCT_DESCRIPTIONS.lst ORDERENTRY_METADATA.lst使用之前创建的SQL文件,编写IMPORT TABLE与要导入的表数据的模式匹配的语句。
删除所有特定于Oracle的属性,重新映射所有Oracle数据类型,重构所有CREATE TABLE语句以包括主键。
使用下表进行数据类型映射:
| Oracle数据类型 | hubble数据类型 |
|---|---|
| BLOB | BYTES 1个 |
| CHAR(n),CHARACTER(n)n <256 | CHAR(n),CHARACTER(n) |
| CLOB | STRING 1个 |
| DATE | DATE |
| FLOAT(n) | DECIMAL(n) |
| INTERVAL YEAR(p) TO MONTH | VARCHAR, INTERVAL |
| INTERVAL DAY(p) TO SECOND(s) | VARCHAR, INTERVAL |
| JSON | JSON 2 |
| LONG | STRING |
| LONG RAW | BYTES |
| NCHAR(n)n <256 | CHAR(n) |
| NCHAR(n)n> 255 | VARCHAR, STRING |
| NCLOB | STRING |
| NUMBER(p,0),NUMBER(p)1 <= p <5 | INT2 3 |
| NUMBER(p,0),NUMBER(p)5 <= p <9 | INT4 3 |
| NUMBER(p,0),NUMBER(p)9 <= p <19 | INT8 3 |
| NUMBER(p,0),NUMBER(p)19 <= p <= 38 | DECIMAL(p) |
| NUMBER(p,s) s> 0 | DECIMAL(p,s) |
| NUMBER, NUMBER(*) | DECIMAL |
| NVARCHAR2(n) | VARCHAR(n) |
| RAW(n) | BYTES |
| TIMESTAMP(p) | TIMESTAMP |
| TIMESTAMP(p) WITH TIME ZONE | TIMESTAMP WITH TIMEZONE |
| VARCHAR(n), VARCHAR2(n) | VARCHAR(n) |
| XML | JSON 2 |
如同CSV数据迁移一致