Data processing tools bridge raw data and OceanBase Database. This section gives an overview of the main tools.
OceanBase Developer Center (ODC)
OceanBase Developer Center (ODC) is a GUI-based database development tool and a collaborative management platform for data development, production, and changes. ODC is available in two forms: client ODC and web ODC. Client ODC focuses on database development and is lightweight and easy to deploy on Windows, macOS, and Linux. Web ODC focuses on database changes, ensuring their security, compliance, and efficiency, and provides collaborative management capabilities.
For more information about how to use ODC, see OceanBase Developer Center (ODC).
ETL tools in the ecosystem ETL solution
Flink
Flink is an open-source framework for large-scale data processing and analysis. Flink CDC is a component that runs on Flink and captures database change events. Together, they provide a powerful real-time data processing and analysis solution.
For more information about how to use Flink, see Use Flink CDC to synchronize data from a MySQL database to OceanBase Database and Use Flink CDC to migrate data from OceanBase Database to a MySQL database.
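As a rough illustration of how a Flink CDC pipeline is usually declared, the sketch below renders the three Flink SQL statements for a one-table sync: a source table on the `mysql-cdc` connector, a sink table, and the `INSERT` that links them. It only builds strings and does not start a Flink job. The host names, credentials, table name, and columns are hypothetical placeholders, and writing to OceanBase Database through Flink's `jdbc` connector over a MySQL-compatible URL is an assumption; the linked topics describe the supported setup.

```python
from textwrap import dedent


def mysql_cdc_source_ddl(table: str, database: str, host: str,
                         user: str, password: str) -> str:
    """Render Flink SQL for a source table that reads MySQL change events."""
    return dedent(f"""\
        CREATE TABLE {table}_source (
            id INT,
            name STRING,
            PRIMARY KEY (id) NOT ENFORCED
        ) WITH (
            'connector' = 'mysql-cdc',
            'hostname' = '{host}',
            'port' = '3306',
            'username' = '{user}',
            'password' = '{password}',
            'database-name' = '{database}',
            'table-name' = '{table}'
        )""")


def jdbc_sink_ddl(table: str, jdbc_url: str, user: str, password: str) -> str:
    """Render Flink SQL for a sink table (assumed: JDBC over a MySQL-compatible URL)."""
    return dedent(f"""\
        CREATE TABLE {table}_sink (
            id INT,
            name STRING,
            PRIMARY KEY (id) NOT ENFORCED
        ) WITH (
            'connector' = 'jdbc',
            'url' = '{jdbc_url}',
            'table-name' = '{table}',
            'username' = '{user}',
            'password' = '{password}'
        )""")


def sync_statement(table: str) -> str:
    """Render the continuous INSERT that copies changes from source to sink."""
    return f"INSERT INTO {table}_sink SELECT id, name FROM {table}_source"


# Hypothetical connection details, for illustration only.
statements = [
    mysql_cdc_source_ddl("customers", "shop", "mysql-host", "flink", "secret"),
    jdbc_sink_ddl("customers", "jdbc:mysql://ob-host:2881/shop", "flink", "secret"),
    sync_statement("customers"),
]

if __name__ == "__main__":
    print(";\n\n".join(statements) + ";")
```

Running the script prints the statements so that they can be reviewed before being submitted through a Flink SQL client.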
dbt (Data Build Tool)
dbt is an open-source data transformation tool that lets you transform data into tables or views by writing SQL. With dbt-oceanbase, you can use dbt to analyze data in OceanBase Database.
For more information about how to use dbt, see Analyze data in OceanBase Database by using dbt.
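To make the difference between a table and a view concrete, the sketch below materializes the same `SELECT` both ways. It is not dbt and does not use dbt-oceanbase: an in-memory SQLite database stands in for OceanBase Database, and the `orders` data and the `materialize` helper are hypothetical names introduced for illustration.

```python
import sqlite3

# In-memory SQLite stands in for the target database (an assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount REAL);
    INSERT INTO orders VALUES (1, 'alice', 30.0), (2, 'bob', 12.5), (3, 'alice', 7.5);
""")

# The "model": a plain SELECT, which is the unit a SQL transformation tool works with.
MODEL_SQL = "SELECT customer, SUM(amount) AS total FROM orders GROUP BY customer"


def materialize(conn: sqlite3.Connection, name: str, select_sql: str, kind: str) -> None:
    """Create `name` as a table or a view from a SELECT statement."""
    if kind not in ("table", "view"):
        raise ValueError(f"unsupported materialization: {kind}")
    conn.execute(f"DROP {kind.upper()} IF EXISTS {name}")
    conn.execute(f"CREATE {kind.upper()} {name} AS {select_sql}")


materialize(conn, "customer_totals_v", MODEL_SQL, "view")
materialize(conn, "customer_totals_t", MODEL_SQL, "table")

# A new order arrives after both objects were built.
conn.execute("INSERT INTO orders VALUES (4, 'bob', 10.0)")

# The view re-runs the SELECT on read; the table keeps the snapshot taken at build time.
view_rows = dict(conn.execute("SELECT customer, total FROM customer_totals_v"))
table_rows = dict(conn.execute("SELECT customer, total FROM customer_totals_t"))
```

After the extra insert, `view_rows` reflects the new order while `table_rows` does not, which is the trade-off to weigh when choosing how to materialize a model.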
DataWorks
DataWorks is a data development and governance platform on Alibaba Cloud. It provides end-to-end processing capabilities for data warehouses, data lakes, and lakehouses based on big data engines such as MaxCompute, Hologres, EMR, AnalyticDB, and CDP. DataWorks supports data integration, data development, data modeling, and data analysis.
For more information about how to use DataWorks, see DataWorks.
AWS Glue
AWS Glue is a serverless data integration service that helps you discover, prepare, move, and integrate data from multiple sources to support analysis, machine learning, and application development. It combines data discovery, modern ETL, data cleaning and transformation, and centralized cataloging in a single service. AWS Glue requires no infrastructure management, supports ETL, ELT, and streaming workloads, and scales on demand to accommodate any data size and type.
For more information about how to use AWS Glue, see Use AWS Glue to migrate OceanBase data.
Tool selection recommendations
| Tool | Core advantage | Scenarios |
|---|---|---|
| ODC | Dedicated management tool for OceanBase Database, deeply integrated with the ecosystem ETL solution | Development and operations of OceanBase Database |
| Flink | Real-time stream processing and high-throughput batch processing | Stream data monitoring and real-time report generation |
| dbt | Standardized data modeling and traceable analysis process | Analytical data warehouse modeling (such as OceanBase Database) |
| DataWorks | Full-stack support for the Alibaba Cloud ecosystem and visual development | Cloud-native data lake construction and offline analysis |
| AWS Glue | Serverless ETL and seamless integration with the AWS ecosystem | AWS cloud data lake construction and batch processing |
