Data processing tools bridge raw data and OceanBase Database. This topic briefly introduces the mainstream tools.
OceanBase Developer Center (ODC)
OceanBase Developer Center (ODC) is a graphical database development tool and a collaborative platform for data development and production change management. ODC is available in two editions: desktop and web. The desktop edition focuses on database development, runs on Windows, macOS, and Linux, and is lightweight and easy to deploy. The web edition provides development tool capabilities along with management and collaboration features, with an emphasis on the security, compliance, and efficiency of database changes.
For more information about how to use ODC, see OceanBase Developer Center (ODC).
ETL tools in the ecosystem
Flink
Flink is an open-source framework for large-scale data processing and analysis. Flink CDC is a component on this platform that captures database change events. Together, they provide a powerful solution for real-time data processing and analysis.
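The idea of capturing change events and replaying them on a target can be sketched in a few lines of Python. This is only a conceptual illustration, not the Flink CDC API: the event fields (`op`, `key`, `row`) and the function `apply_change_events` are hypothetical names chosen for this example, and an in-memory dictionary stands in for the target table.

```python
# Conceptual sketch only: this is NOT the Flink CDC API or its event format.
def apply_change_events(target, events):
    """Replay change events, in order, against an in-memory copy of a table.

    `target` maps primary key -> row dict. Each event is a dict with the
    hypothetical fields "op" ("insert", "update", "delete"), "key", and "row".
    """
    for event in events:
        op = event["op"]
        if op in ("insert", "update"):
            # Upsert: inserts and updates both write the latest row image.
            target[event["key"]] = dict(event["row"])
        elif op == "delete":
            # Deleting a key that is already absent is treated as a no-op.
            target.pop(event["key"], None)
        else:
            raise ValueError(f"unknown operation: {op!r}")
    return target


# A short, made-up change stream from a source table.
events = [
    {"op": "insert", "key": 1, "row": {"id": 1, "status": "new"}},
    {"op": "insert", "key": 2, "row": {"id": 2, "status": "new"}},
    {"op": "update", "key": 1, "row": {"id": 1, "status": "paid"}},
    {"op": "delete", "key": 2, "row": None},
]
replica = apply_change_events({}, events)
print(replica)  # {1: {'id': 1, 'status': 'paid'}}
```

Applying inserts and updates as upserts keeps the replay idempotent for repeated events, which is one common design choice for change-stream consumers; a real pipeline would also need to handle ordering, schema changes, and failure recovery.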
For more information about how to use Flink, see Use Flink CDC to synchronize data from a MySQL database to OceanBase Database and Use Flink CDC to migrate data from OceanBase Database to a MySQL database.
dbt (Data Build Tool)
dbt (data build tool) is an open-source data transformation tool that uses SQL to turn SELECT statements into tables or views. With dbt-oceanbase, you can use dbt to analyze data in OceanBase Database.
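As a rough illustration of turning a query into a table or view, the sketch below wraps a SELECT statement in `CREATE TABLE ... AS` or `CREATE VIEW ... AS`. It is not how dbt or dbt-oceanbase is implemented: the helper `materialize`, the `orders` table, and the sample rows are assumptions made for this example, and Python's built-in `sqlite3` module stands in for OceanBase Database so that the code runs without any setup.

```python
import sqlite3


def materialize(conn, name, select_sql, kind="view"):
    """Wrap a SELECT statement in DDL so its result becomes a table or a view.

    Hypothetical helper for illustration; `name` and `select_sql` are trusted
    inputs here and are interpolated directly into the DDL.
    """
    if kind not in ("table", "view"):
        raise ValueError("kind must be 'table' or 'view'")
    keyword = kind.upper()
    # Rebuild the object from scratch on every run.
    conn.execute(f"DROP {keyword} IF EXISTS {name}")
    conn.execute(f"CREATE {keyword} {name} AS {select_sql}")


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "alice", 30.0), (2, "bob", 12.5), (3, "alice", 20.0)],
)

# The "model" is just a SELECT statement.
model_sql = "SELECT customer, SUM(amount) AS total FROM orders GROUP BY customer"
materialize(conn, "customer_totals", model_sql, kind="table")

rows = conn.execute(
    "SELECT customer, total FROM customer_totals ORDER BY customer"
).fetchall()
print(rows)  # [('alice', 50.0), ('bob', 12.5)]
```

Passing `kind="view"` stores only the query definition, so results are computed at read time, whereas `kind="table"` stores the result rows as of the moment the statement runs.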
For more information about how to use dbt, see Analyze OceanBase data by using dbt.
DataWorks
DataWorks is a data development and governance platform on Alibaba Cloud. Built on big data engines such as MaxCompute, Hologres, EMR, AnalyticDB, and CDP, it provides unified end-to-end processing capabilities for data warehouses, data lakes, and lakehouse solutions. DataWorks offers features such as data integration, data development, data modeling, and data analysis.
For more information about how to use DataWorks, see DataWorks.
AWS Glue
AWS Glue is a serverless data integration service that helps you discover, prepare, move, and integrate data from multiple sources for analytics, machine learning, and application development. It combines data discovery, modern ETL, data cleaning and transformation, and centralized cataloging in a single service. With AWS Glue, you do not need to manage infrastructure. It supports workloads such as ETL, ELT, and streaming, and scales on demand to handle data of any size and type.
Tool selection recommendations
| Tool | Core advantages | Scenarios |
|---|---|---|
| ODC | OceanBase Database management tool with deep integration into the ecosystem | Development and O&M of OceanBase Database |
| Flink | Real-time stream processing and high-throughput batch processing | Stream data monitoring and real-time report generation |
| dbt | Standardized data modeling and traceable analysis processes | Modeling of analytical data warehouses (such as OceanBase Database) |
| DataWorks | Full-stack support for Alibaba Cloud ecosystem and visual development | Cloud-native data lakes and offline analysis |
| AWS Glue | Serverless ETL and seamless integration with the AWS ecosystem | Building data lakes and batch processing in AWS Cloud |