Data processing tools bridge raw data and OceanBase Database. This topic briefly introduces the mainstream tools.
OceanBase Developer Center (ODC)
OceanBase Developer Center (ODC) is a graphical database development tool and a collaborative platform for data development and production change management. ODC is available in desktop and web versions. The desktop version focuses on database development, supports Windows, macOS, and Linux, and is lightweight and easy to deploy. The web version adds management and collaboration features to the development tools, with an emphasis on the security, compliance, and efficiency of database changes.
For more information about how to use ODC, see OceanBase Developer Center (ODC).
ETL tools in the ecosystem
Flink
Flink is an open-source framework for large-scale data processing and analysis. Flink CDC is a component on this platform that captures database change events. Together, they provide a powerful real-time data processing and analysis solution.
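
To make the idea of capturing and replaying change events concrete, the following is a minimal sketch in plain Python. It does not use Flink or Flink CDC, and the `ChangeEvent` class and `apply_changes` function are hypothetical names introduced only for illustration. It assumes each event carries an operation type, a primary key, and the new row image, and shows how replaying such events in order keeps a target table in sync with its source.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ChangeEvent:
    """Hypothetical change event; NOT the Flink CDC record format."""
    op: str                     # "insert", "update", or "delete"
    key: int                    # primary key of the affected row
    row: Optional[dict] = None  # new row image; None for deletes


def apply_changes(target: dict, events: list) -> dict:
    """Replay change events, in order, against an in-memory target table."""
    for event in events:
        if event.op in ("insert", "update"):
            # Upsert: the latest row image for a key always wins.
            target[event.key] = event.row
        elif event.op == "delete":
            target.pop(event.key, None)
        else:
            raise ValueError(f"unknown operation: {event.op}")
    return target


events = [
    ChangeEvent("insert", 1, {"id": 1, "name": "alice"}),
    ChangeEvent("insert", 2, {"id": 2, "name": "bob"}),
    ChangeEvent("update", 1, {"id": 1, "name": "alicia"}),
    ChangeEvent("delete", 2),
]
replica = apply_changes({}, events)
print(replica)  # {1: {'id': 1, 'name': 'alicia'}}
```

A real pipeline additionally has to handle ordering, checkpointing, and schema changes, which this sketch leaves out.
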
For more information about how to use Flink, see Use Flink CDC to synchronize data from a MySQL database to OceanBase Database and Use Flink CDC to migrate data from OceanBase Database to a MySQL database.
dbt (Data Build Tool)
dbt (data build tool) is an open-source data transformation tool. You write transformations as SQL SELECT statements, and dbt materializes them as tables or views. You can use dbt-oceanbase to analyze data in OceanBase Database.
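
The idea of expressing a transformation in SQL and turning it into a table or view can be sketched as follows. This is an illustration only: it uses SQLite from the Python standard library as a stand-in database rather than OceanBase Database, it does not call dbt, and the `materialize` helper and the `orders` and `customer_totals` names are hypothetical.

```python
import sqlite3


def materialize(conn, name, select_sql, kind="view"):
    """Wrap a SELECT statement in CREATE VIEW/TABLE ... AS (hypothetical helper)."""
    if kind not in ("view", "table"):
        raise ValueError(f"unsupported materialization: {kind}")
    keyword = kind.upper()
    # Rebuild the object so that rerunning the model is idempotent.
    conn.execute(f"DROP {keyword} IF EXISTS {name}")
    conn.execute(f"CREATE {keyword} {name} AS {select_sql}")


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "alice", 10.0), (2, "bob", 5.0), (3, "alice", 7.5)],
)

# The "model" is just a SELECT; the helper decides how to persist it.
model_sql = "SELECT customer, SUM(amount) AS total FROM orders GROUP BY customer"
materialize(conn, "customer_totals", model_sql, kind="view")

rows = conn.execute(
    "SELECT customer, total FROM customer_totals ORDER BY customer"
).fetchall()
print(rows)  # [('alice', 17.5), ('bob', 5.0)]
```

Switching `kind` to `"table"` stores the result physically instead of recomputing it on each query, which is the usual trade-off between the two materializations.
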
For more information about how to use dbt, see Analyze data in OceanBase Database by using dbt.
DataWorks
DataWorks is a data development and governance platform on Alibaba Cloud. It provides unified end-to-end processing capabilities for data warehouses, data lakes, and hybrid data lake/warehouse solutions based on big data engines such as MaxCompute, Hologres, EMR, AnalyticDB, and CDP. DataWorks offers various features, including data integration, data development, data modeling, and data analysis.
For more information about how to use DataWorks, see DataWorks.
AWS Glue
AWS Glue is a serverless data integration service that helps users easily discover, prepare, move, and integrate data from multiple sources to support analysis, machine learning, and application development. It provides comprehensive data integration capabilities, including data discovery, modern ETL, data cleaning and transformation, and centralized cataloging, all integrated into a single service. AWS Glue does not require users to manage infrastructure, supports various workloads such as ETL, ELT, and streaming, and can scale on demand to accommodate any data size and type.
For more information about how to use AWS Glue, see Migrate data from OceanBase Database by using AWS Glue.
Tool selection recommendations
| Tool | Core advantages | Scenarios |
|---|---|---|
| ODC | Dedicated management tool for OceanBase Database, deeply integrated with the ecosystem | Development and O&M of OceanBase Database |
| Flink | Real-time stream processing and high-throughput batch processing | Stream data monitoring, real-time report generation |
| dbt | Standardized data modeling and traceable analysis processes | Modeling analytical data warehouses (such as OceanBase Database) |
| DataWorks | Full-stack support in the Alibaba Cloud ecosystem, visual development | Cloud-native data lakes and offline analysis |
| AWS Glue | Serverless ETL, seamless integration with the AWS ecosystem | Building data lakes and batch processing in the AWS cloud |
