Write Doris data to OmniFabric using DataX

This article describes how to write Doris data offline to an OmniFabric database using the DataX tool.

Before you start

Before you start writing data to OmniFabric with DataX, make sure the software used in this article is installed: Doris, OmniFabric, and DataX (launched with Python in the steps below).

Steps

Creating Test Data in Doris

create database test;

use test;

CREATE TABLE IF NOT EXISTS example_tbl
(
    user_id BIGINT NOT NULL COMMENT "user id",
    date DATE NOT NULL COMMENT "data insertion date time",
    city VARCHAR(20) COMMENT "user city",
    age SMALLINT COMMENT "user age",
    sex TINYINT COMMENT "user gender"
)
DUPLICATE KEY(user_id, date)
DISTRIBUTED BY HASH(user_id) BUCKETS 1
PROPERTIES (
    "replication_num"="1"
);

insert into example_tbl values
(10000,'2017-10-01','Beijing',20,0),
(10000,'2017-10-01','Beijing',20,0),
(10001,'2017-10-01','Beijing',30,1),
(10002,'2017-10-02','Shanghai',20,1),
(10003,'2017-10-02','Guangzhou',32,0),
(10004,'2017-10-01','Shenzhen',35,0),
(10004,'2017-10-03','Shenzhen',35,0);
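
The test data deliberately contains one exact duplicate (the first two rows), which a DUPLICATE KEY table keeps as separate rows, so a complete copy should carry 7 rows rather than 6. As a minimal sketch for checking that expectation offline, the Python snippet below mirrors the inserted values; the names `ROWS` and `duplicate_rows` are illustrative and not part of Doris or DataX.

```python
from collections import Counter

# The seven rows from the INSERT statement above, as plain tuples.
ROWS = [
    (10000, "2017-10-01", "Beijing", 20, 0),
    (10000, "2017-10-01", "Beijing", 20, 0),
    (10001, "2017-10-01", "Beijing", 30, 1),
    (10002, "2017-10-02", "Shanghai", 20, 1),
    (10003, "2017-10-02", "Guangzhou", 32, 0),
    (10004, "2017-10-01", "Shenzhen", 35, 0),
    (10004, "2017-10-03", "Shenzhen", 35, 0),
]


def duplicate_rows(rows):
    """Return each row that occurs more than once, with its count."""
    counts = Counter(rows)
    return {row: n for row, n in counts.items() if n > 1}


# Total rows versus distinct rows, then the duplicated row(s).
print(len(ROWS), "rows,", len(set(ROWS)), "distinct")
print(duplicate_rows(ROWS))
```

The row count is the number to compare against the record count that DataX reports after the job finishes.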

Creating the Target Database and Table in OmniFabric

create database sparkdemo;
use sparkdemo;

CREATE TABLE IF NOT EXISTS example_tbl
(
    user_id BIGINT NOT NULL COMMENT "user id",
    date DATE NOT NULL COMMENT "data insertion date time",
    city VARCHAR(20) COMMENT "user city",
    age SMALLINT COMMENT "user age",
    sex TINYINT COMMENT "user gender"
);
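
The target table repeats the Doris column definitions but leaves out the Doris-specific DUPLICATE KEY, DISTRIBUTED BY, and PROPERTIES clauses. One way to keep the two statements in sync is to generate both from a single column list. The sketch below does that in Python; `COLUMNS`, `column_block`, `target_ddl`, and `doris_ddl` are hypothetical helper names, and the generated text is only meant to match the statements in this article.

```python
# Column name, SQL type, and comment, as used in both CREATE TABLE statements.
COLUMNS = [
    ("user_id", "BIGINT NOT NULL", "user id"),
    ("date", "DATE NOT NULL", "data insertion date time"),
    ("city", "VARCHAR(20)", "user city"),
    ("age", "SMALLINT", "user age"),
    ("sex", "TINYINT", "user gender"),
]


def column_block(columns):
    """Render the shared column definitions, one per line."""
    return ",\n".join(f'    {n} {t} COMMENT "{c}"' for n, t, c in columns)


def target_ddl(table, columns):
    """CREATE TABLE for the target: columns only, no Doris clauses."""
    return f"CREATE TABLE IF NOT EXISTS {table}\n(\n{column_block(columns)}\n);"


def doris_ddl(table, columns, key_cols, hash_col, buckets=1):
    """CREATE TABLE for Doris: same columns plus the Doris-specific clauses."""
    return (
        f"CREATE TABLE IF NOT EXISTS {table}\n(\n{column_block(columns)}\n)\n"
        f"DUPLICATE KEY({', '.join(key_cols)})\n"
        f"DISTRIBUTED BY HASH({hash_col}) BUCKETS {buckets}\n"
        'PROPERTIES (\n    "replication_num"="1"\n);'
    )


print(target_ddl("example_tbl", COLUMNS))
print(doris_ddl("example_tbl", COLUMNS, ["user_id", "date"], "user_id"))
```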

Edit the DataX JSON job file

Go to the datax/job directory and put the following content in doris2mo.json:

{
  "job": {
    "setting": {
      "speed": {
        "channel": 8
      }
    },
    "content": [
      {
        "reader": {
          "name": "mysqlreader",
          "parameter": {
            "username": "root",
            "password": "root",
            "splitPk": "user_id",
            "column": [
              "*"
            ],
            "connection": [
              {
                "table": [
                  "example_tbl"
                ],
                "jdbcUrl": [
                  "jdbc:mysql://xx.xx.xx.xx:9030/test"
                ]
              }
            ],
            "fetchSize": 1024
          }
        },
        "writer": {
          "name": "OmniFabricwriter",
          "parameter": {
            "writeMode": "insert",
            "username": "root",
            "password": "111",
            "column": [
              "*"
            ],
            "connection": [
              {
                "jdbcUrl": "jdbc:mysql://xx.xx.xx.xx:6001/sparkdemo",
                "table": [
                  "example_tbl"
                ]
              }
            ]
          }
        }
      }
    ]
  }
}
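
Hand-edited job files are easy to break with a stray quote or bracket. As a hedged alternative, the job file can be generated from a Python dictionary so that `json.dumps` takes care of quoting. The sketch below mirrors the settings shown above; `build_job` is an illustrative helper, not part of DataX, and the `xx.xx.xx.xx` hosts are the same placeholders used in this article.

```python
import json


def build_job(doris_url, target_url, table, channel=8):
    """Return a DataX job dictionary with the same settings as doris2mo.json."""
    reader = {
        "name": "mysqlreader",
        "parameter": {
            "username": "root",
            "password": "root",
            "splitPk": "user_id",
            "column": ["*"],
            # The reader takes jdbcUrl as a list.
            "connection": [{"table": [table], "jdbcUrl": [doris_url]}],
            "fetchSize": 1024,
        },
    }
    writer = {
        "name": "OmniFabricwriter",
        "parameter": {
            "writeMode": "insert",
            "username": "root",
            "password": "111",
            "column": ["*"],
            # The writer takes jdbcUrl as a single string.
            "connection": [{"jdbcUrl": target_url, "table": [table]}],
        },
    }
    return {
        "job": {
            "setting": {"speed": {"channel": channel}},
            "content": [{"reader": reader, "writer": writer}],
        }
    }


job = build_job(
    "jdbc:mysql://xx.xx.xx.xx:9030/test",
    "jdbc:mysql://xx.xx.xx.xx:6001/sparkdemo",
    "example_tbl",
)
text = json.dumps(job, indent=2)
json.loads(text)  # round-trip to confirm the text is valid JSON
print(text)
```

Redirecting the printed output to job/doris2mo.json gives a file equivalent to the one above.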

Start the DataX job

python bin/datax.py job/doris2mo.json

The following results are displayed:

2024-04-28 15:47:38.222 [job-0] INFO  JobContainer -
Task Start Time                  : 2024-04-28 15:47:26
Task End Time                    : 2024-04-28 15:47:38
Total Task Time                  :                 11s
Average Task Throughput          :               12B/s
Record Write Speed               :              0rec/s
Total Records Read               :                   7
Total Read/Write Failures        :                   0
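
To check a run automatically instead of reading the summary by eye, the job can be launched from Python and the summary parsed. This is a minimal sketch under stated assumptions: that DataX prints the summary to standard output in the `label : value` layout shown above, and that a non-zero exit code means the job failed. The helper names (`datax_command`, `parse_summary`, `records_and_failures`, `run_and_check`) are illustrative and not part of DataX.

```python
import subprocess

# The summary lines from the run above, used here as sample input.
SAMPLE = """
2024-04-28 15:47:38.222 [job-0] INFO  JobContainer -
Task Start Time                  : 2024-04-28 15:47:26
Task End Time                    : 2024-04-28 15:47:38
Total Task Time                  :                 11s
Average Task Throughput          :               12B/s
Record Write Speed               :              0rec/s
Total Records Read               :                   7
Total Read/Write Failures        :                   0
"""


def datax_command(job_file, python="python"):
    """Build the same command used above, relative to the DataX directory."""
    return [python, "bin/datax.py", f"job/{job_file}"]


def parse_summary(log_text):
    """Collect 'label : value' lines into a dictionary of strings."""
    summary = {}
    for line in log_text.splitlines():
        label, sep, value = line.partition(" : ")
        if sep:
            summary[label.strip()] = value.strip()
    return summary


def records_and_failures(summary):
    """Return (records read, read/write failures) as integers."""
    return (
        int(summary["Total Records Read"]),
        int(summary["Total Read/Write Failures"]),
    )


def run_and_check(datax_home, job_file, expected_rows):
    """Run the job from datax_home and compare the summary with expectations."""
    done = subprocess.run(
        datax_command(job_file), cwd=datax_home, capture_output=True, text=True
    )
    read, failed = records_and_failures(parse_summary(done.stdout))
    return done.returncode == 0 and read == expected_rows and failed == 0


print(records_and_failures(parse_summary(SAMPLE)))
```

For this article's data, a call such as `run_and_check("datax", "doris2mo.json", 7)` would return `True` only when all 7 test rows were read and no failures were reported; the `"datax"` directory name is a placeholder for your own install path.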