This topic describes the ecosystem tools that are compatible with OceanBase Database in analytical processing (AP) scenarios and how to integrate them. AP scenarios focus on data analysis, reporting, data warehouses, real-time and offline data integration, and BI visualization. They complement transactional processing (TP) scenarios.
For more information about all ecosystem tools integrated with OceanBase Database, including those for TP, AP, and AI scenarios, see Integrations.
Overview
| Module | Tool | Integration item | Description | MySQL mode compatibility |
|---|---|---|---|---|
| Data integration | AWS Glue | AWS Glue | AWS Glue is a fully managed data integration service that helps you discover, prepare, move, and integrate data from multiple sources. It supports analysis, machine learning, and application development, and provides data discovery, modern ETL, data cleansing and transformation, and centralized cataloging. This topic describes how to use AWS Glue to migrate data from OceanBase Database. | Fully supported |
| Data integration | Apache Flink | Apache Flink and OceanBase Database data integration guide | Apache Flink is an open-source stream and batch processing engine that efficiently processes real-time data streams and batch data. This guide helps developers understand how to integrate Flink with OceanBase Database and choose the appropriate connector and integration solution based on the scenario. | Fully supported |
| Data integration | Apache Flink | Use Flink DirectLoad for OceanBase Database bypass loading | The Flink DirectLoad connector is specifically designed for OceanBase Database and uses bypass loading technology to write large volumes of data to the database with high throughput. This topic describes how to use this connector to implement bypass loading in OceanBase Database. | Fully supported |
| Data integration | Apache Flink | Use the Flink Connector OceanBase to synchronize data to OceanBase Database in real time | The Flink Connector OceanBase is based on JDBC and supports real-time data writing to OceanBase Database in MySQL or Oracle compatible mode. This topic describes how to use this connector to synchronize data from Flink to OceanBase Database in real time. | Fully supported |
| Data integration | dbt | Analyze OceanBase Database data by using dbt | dbt (data build tool) is an open-source data transformation tool that uses SQL to transform data into tables or views. This topic describes how to use dbt-oceanbase to analyze data in OceanBase Database. | Fully supported |
| Data integration | Spark Catalog | Connect to OceanBase Database by using Spark Catalog | OceanBase Database supports Spark Catalog. This topic describes how to configure and use OceanBase Spark Catalog. | Fully supported |
| Data integration | NiFi | Integrate OceanBase Database by using Apache NiFi | Apache NiFi is an automated data flow processing platform that efficiently and reliably transfers data between systems. By reading OceanBase Database binlogs with NiFi, you can distribute data to file systems, Kafka, or HTTP endpoints. This topic describes how to integrate OceanBase Database by using NiFi. | Fully supported |
| Data integration | Dataphin | Synchronize OceanBase Database data by using Dataphin | Dataphin is a data construction and governance platform provided by Alibaba Cloud. This topic describes how to configure OceanBase Database as a data source in Dataphin to implement offline and real-time data integration. | Fully supported |
| Data integration | RisingWave CDC | Access OceanBase Database MySQL data by using RisingWave CDC | RisingWave is a streaming database. Its MySQL-CDC feature captures data changes in OceanBase Database in MySQL compatible mode in real time and processes and analyzes the data in streams. This topic describes how to access OceanBase Database MySQL data by using RisingWave CDC. | Fully supported |
| Orchestration | Airflow | Integrate OceanBase Database with Airflow | Apache Airflow is an open-source platform for developing, scheduling, and monitoring batch workflows. Workflows can be defined in Python and managed through a web interface. This topic describes how to integrate OceanBase Database with Airflow. | Fully supported |
| Orchestration | DolphinScheduler | Configure OceanBase Database as the data source for DolphinScheduler | DolphinScheduler is an open-source distributed workflow task scheduling system that supports multiple types of tasks. It supports OceanBase Database as a data source. In MySQL compatible mode, you can configure OceanBase Database in the same way as MySQL. This topic describes how to configure OceanBase Database as the data source for DolphinScheduler. | Fully supported |
| Visualization | Superset | Analyze data by using Superset and OceanBase Database | Superset is an open-source business intelligence tool for data exploration and visualization. This topic describes how to connect to OceanBase Database by using Superset and analyze data. | Fully supported |
| Visualization | Excel | Connect to OceanBase Database and retrieve data in Excel | Excel is a widely used spreadsheet and data visualization tool. This topic describes how to connect to OceanBase Database by using ODBC data sources in WPS Office Excel and Microsoft Excel to query and analyze data. | Fully supported |
| Visualization | Power BI | Connect to OceanBase Database and retrieve data in Power BI | Power BI is a business intelligence tool provided by Microsoft. It allows you to connect to multiple data sources, transform and analyze data, and create interactive data visualization reports. This topic describes how to connect to OceanBase Database and retrieve data in Power BI. | Fully supported |
| Visualization | Quick BI | Connect to OceanBase Database and retrieve data in Quick BI | Quick BI is a self-service data analysis and visualization service provided by Alibaba Cloud. It allows you to explore data by dragging and dropping fields and create charts and dashboards. This topic describes how to connect to OceanBase Database and retrieve data in Quick BI. | Fully supported |
| Visualization | Tableau | Connect to OceanBase Database in Tableau | Tableau is a data visualization tool that helps users, from beginners to data scientists, discover insights from data. It provides an intuitive interface and powerful data processing capabilities. This topic describes how to connect to OceanBase Database in Tableau. | Fully supported |
| Visualization | Guanyuan BI | Connect to OceanBase Database in Guanyuan BI | Guanyuan BI is a business intelligence and data analysis platform. This topic describes how to connect to OceanBase Database in Guanyuan BI to display data. | Fully supported |
| Visualization | Yonghong BI | Connect to OceanBase Database in vividime V11 | Yonghong BI (vividime) is a business intelligence and data analysis platform. This topic describes how to connect to OceanBase Database in vividime V11 to display data. | Fully supported |
Data integration
Data integration is an IT process that integrates data from different sources into one view to support analysis, reporting, and business decision-making. It is a foundational capability in AP scenarios. Data can be scattered across relational databases, files, applications, NoSQL databases, and cloud storage services. By integrating these data sources, you can create a unified data layer for analysis.
The tools listed in the Data integration module of the preceding table, such as Flink, dbt, Spark Catalog, NiFi, Dataphin, and RisingWave CDC, are all applicable to AP data pipelines and data warehouse construction.
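The idea of integrating scattered sources into one view can be illustrated with a minimal Python sketch. It is not tied to any tool in the table: the two in-memory sources, the `order_id` key, and the function names are all hypothetical and exist only to show how attributes from different sources can be merged into one record per key.

```python
import csv
import io
import json

# Hypothetical sample data standing in for two separate sources
# (for example, a file export and an application API).
CSV_SOURCE = "order_id,amount\n1,20.5\n2,8.0\n"
JSON_SOURCE = '[{"order_id": 1, "region": "east"}, {"order_id": 2, "region": "west"}]'


def load_csv_rows(text):
    """Parse CSV text into a dict keyed by integer order_id."""
    rows = {}
    for row in csv.DictReader(io.StringIO(text)):
        rows[int(row["order_id"])] = {"amount": float(row["amount"])}
    return rows


def load_json_rows(text):
    """Parse a JSON array into a dict keyed by order_id."""
    rows = {}
    for item in json.loads(text):
        rows[item["order_id"]] = {"region": item["region"]}
    return rows


def unified_view(*sources):
    """Merge the attributes from every source into one record per key."""
    merged = {}
    for source in sources:
        for key, attrs in source.items():
            merged.setdefault(key, {"order_id": key}).update(attrs)
    # Return records sorted by key so the result is deterministic.
    return [merged[key] for key in sorted(merged)]


if __name__ == "__main__":
    for record in unified_view(load_csv_rows(CSV_SOURCE), load_json_rows(JSON_SOURCE)):
        print(record)
```

In a real pipeline, the integration tools perform this merging at scale and write the result to OceanBase Database; the sketch only shows the shape of the operation.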
Orchestration
In AP scenarios, orchestration tools are used to manage, schedule, and coordinate data processing tasks and data flows, such as ETL, data cleansing, verification, and publishing. By using workflow orchestration, you can execute analytical tasks in a stable and repeatable manner.
The tools listed in the Orchestration module of the preceding table, such as Airflow and DolphinScheduler, can be used in AP scenarios to schedule data jobs with OceanBase Database as the source or destination.
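What an orchestrator does with task dependencies can be sketched in a few lines of Python. The example below uses only the standard library (`graphlib`, available in Python 3.9 and later) and does not use the Airflow or DolphinScheduler APIs. The task names and the `run_pipeline` helper are hypothetical; they mirror the ETL, cleansing, verification, and publishing steps mentioned above.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each task maps to the set of tasks it depends on.
PIPELINE = {
    "extract": set(),
    "cleanse": {"extract"},
    "verify": {"cleanse"},
    "publish": {"verify"},
}


def execution_order(dependencies):
    """Return task names in an order that respects every dependency."""
    return list(TopologicalSorter(dependencies).static_order())


def run_pipeline(dependencies, actions):
    """Call each task's action in dependency order and return the run log."""
    log = []
    for task in execution_order(dependencies):
        actions[task]()
        log.append(task)
    return log


if __name__ == "__main__":
    # Placeholder actions; a real job would read from or write to a database.
    actions = {name: (lambda name=name: print("running", name)) for name in PIPELINE}
    run_pipeline(PIPELINE, actions)
```

Because the order is derived from the declared dependencies rather than hard-coded, the same definition produces the same run order every time. Orchestration tools rely on this property to make analytical jobs repeatable.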
Visualization
In AP scenarios, visualization tools are used to convert data into charts and dashboards to support self-service analysis, reporting, and decision-making. The tools listed in the Visualization module of the preceding table, such as Superset, Excel, Power BI, Quick BI, Tableau, Guanyuan BI, and Yonghong BI, all support connection to OceanBase Database, making it easy to explore and display data in AP scenarios.
