• 初步探索 superset新增导出excel功能


    一、环境

    • Python 3.8.10
    • Node v16.16.0
    • Windows 10 专业版 64 位操作系统
    • Superset 1.5

    二、前端操作步骤

    我们可以参照csv的导出方式,去扩展excel。以以下页面为例

    2.1 确定导出csv的地方

    2.1.1查找样式

    发现有个chart-slice的类名,以这个关键词进行全局搜索发现Chart下有包含

    2.1.2 找到渲染导出csv的button组件

    进入chart组件可以看到关联组件SliceHeader

    按照查找思路SliceHeader --> SliceHeaderControls,该文件下全局搜索CSV,可以确定csv的button组件导出位置

    2.2 写入导出EXCEL组件

    \superset-1.5\superset-frontend\src\dashboard\components\SliceHeaderControls\index.tsx

    2.2.1 添加导出菜单

    目前可以看菜单按钮有两块

    1. {this.props.slice.viz_type !== 'filter_box' &&
    2. this.props.supersetCanCSV && (
    3. <Menu.Item key={MENU_KEYS.EXPORT_CSV}>{t('Export CSV')}Menu.Item>
    4. )}
    5. {this.props.slice.viz_type !== 'filter_box' &&
    6. isFeatureEnabled(FeatureFlag.ALLOW_FULL_CSV_EXPORT) &&
    7. this.props.supersetCanCSV &&
    8. isTable && (
    9. <Menu.Item key={MENU_KEYS.EXPORT_FULL_CSV}>
    10. {t('Export full CSV')}
    11. Menu.Item>
    12. )}

    依葫芦画瓢,在后面新增两个excel导出按钮

    1. {this.props.slice.viz_type !== 'filter_box' &&
    2. this.props.supersetCanExcel && (
    3. <Menu.Item key={MENU_KEYS.EXPORT_EXCEL}>{t('Export EXCEL')}Menu.Item>
    4. )}
    5. {this.props.slice.viz_type !== 'filter_box' &&
    6. isFeatureEnabled(FeatureFlag.ALLOW_FULL_EXCEL_EXPORT) &&
    7. this.props.supersetCanExcel &&
    8. isTable && (
    9. <Menu.Item key={MENU_KEYS.EXPORT_FULL_EXCEL}>
    10. {t('Export full EXCEL')}
    11. Menu.Item>
    12. )}

     2.2.2 添加相应字段

    此时应该有很多未定义报错,所以需要添加以下代码

    找到MENU_KEYS,在末尾添加

    1. EXPORT_FULL_CSV: 'export_full_csv',
    2. EXPORT_FULL_EXCEL: 'export_full_excel',

    找到\superset-1.5\superset-frontend\packages\superset-ui-core\src\utils\featureFlags.ts中的FeatureFlag,在末尾添加

    ALLOW_FULL_EXCEL_EXPORT = 'ALLOW_FULL_EXCEL_EXPORT',

    2.2.3 消除props不通过的校验

    现在还有些问题是因为有些属性没有声明,props校验不通过,所以需要添加属性。目前有三个属性未添加supersetCanExcel、exportFullExcel、exportExcel。

    • \superset-1.5\superset-frontend\src\dashboard\components\SliceHeaderControls\index.tsx

    找到export interface SliceHeaderControlsProps,在末尾添加

    1. exportExcel?: (sliceId: number) => void;
    2. exportFullExcel?: (sliceId: number) => void;
    3. supersetCanExcel?: boolean;
    • \superset-1.5\superset-frontend\src\dashboard\components\SliceHeader\index.tsx

    找到const SliceHeader: FC,在末尾添加

    1. exportExcel = () => ({}),
    2. exportFullExcel,
    3. supersetCanExcel = false,
    • superset-1.5\superset-frontend\src\dashboard\containers\Chart.jsx

    在supersetCanCSV: !!dashboardInfo.superset_can_csv, 后面新增

    supersetCanExcel: !!dashboardInfo.superset_can_excel,
    • \superset-1.5\superset-frontend\src\dashboard\actions\hydrate.js

    由于数据源来自于dashboardInfo,所以我们找到来源为hydrate.js,此时发现这个步骤牵扯到权限问题,目前还未实现,所以我们暂时写死

    1. // superset_can_excel: findPermission('can_excel', 'Superset', roles),
    2. superset_can_excel: true,

    2.3 添加Excel导出方法

    2.3.1 \superset-1.5\superset-frontend\src\dashboard\components\gridComponents\Chart.jsx

    现在需要新增导出excel方法,同样参照exportCSV添加以下代码

    1. exportExcel(isFullExcel = false) {
    2. this.props.logEvent(LOG_ACTIONS_EXPORT_EXCEL_DASHBOARD_CHART, {
    3. slice_id: this.props.slice.slice_id,
    4. is_cached: this.props.isCached,
    5. });
    6. exportChart({
    7. formData: isFullExcel
    8. ? { ...this.props.formData, row_limit: this.props.maxRows }
    9. : this.props.formData,
    10. // resultType: 'results',
    11. resultType: 'full',
    12. resultFormat: 'excel',
    13. });
    14. }
    15. exportFullExcel() {
    16. this.exportExcel(true);
    17. }

    2.3.2 \superset-1.5\superset-frontend\src\logger\LogUtils.ts

    需要添加

    1. export const LOG_ACTIONS_EXPORT_EXCEL_DASHBOARD_CHART =
    2. 'export_excel_dashboard_chart';

    2.4 修改exportChart相关导出方法

    exportChart是主要的导出csv的方法

    1. export const exportChart = ({
    2. formData,
    3. resultFormat = 'json',
    4. resultType = 'full',
    5. force = false,
    6. ownState = {},
    7. }) => {
    8. let url;
    9. let payload;
    10. if (shouldUseLegacyApi(formData)) {
    11. const endpointType = getLegacyEndpointType({ resultFormat, resultType });
    12. url = getExploreUrl({
    13. formData,
    14. endpointType,
    15. allowDomainSharding: false,
    16. });
    17. payload = formData;
    18. } else {
    19. url = '/api/v1/chart/data';
    20. payload = buildV1ChartDataPayload({
    21. formData,
    22. force,
    23. resultFormat,
    24. resultType,
    25. ownState,
    26. });
    27. }
    28. console.log("url:", url, payload);
    29. postForm(url, payload);
    30. };

    shouldUseLegacyApi这个函数主要是根据useLegacyApi来判断调用哪个接口,useLegacyApi是在图表注册时就已经确定好了。里面有两个函数,getLegacyEndpointType(生成最终类型的函数), getExploreUrl(生成导出url的函数)。

    2.4.1 修改getLegacyEndpointType

    1. export const getLegacyEndpointType = ({ resultType, resultFormat }) =>
    2. (resultFormat === 'csv' || resultFormat === 'excel' ) ? resultFormat : resultType;

    2.4.2 修改 getExploreUrl

    找到

    1. if (endpointType === 'csv') {
    2. search.csv = 'true';
    3. }

    在后面新增

    1. if(endpointType === 'excel') {
    2. search.excel = 'true';
    3. }

    前端已经修改完毕,最终修改结果

     

    三、后端操作步骤

    此时点击export excel可以发现导出的连接未http://localhost:5000/superset/explore_json/?form_data=%7B%22slice_id%22%3A351%7D&excel=true

    我们可以推测后台的实现方法和explore_json有关,所以在后端查找该方法。

    \superset-1.5\superset\views\core.py

    1. @api
    2. @has_access_api
    3. @handle_api_exception
    4. @event_logger.log_this
    5. @expose(
    6. "/explore_json///",
    7. methods=EXPLORE_JSON_METHODS,
    8. )
    9. @expose("/explore_json/", methods=EXPLORE_JSON_METHODS)
    10. @etag_cache()
    11. @check_resource_permissions(check_datasource_perms)
    12. def explore_json(
    13. self, datasource_type: Optional[str] = None, datasource_id: Optional[int] = None
    14. ) -> FlaskResponse:
    15. """Serves all request that GET or POST form_data
    16. This endpoint evolved to be the entry point of many different
    17. requests that GETs or POSTs a form_data.
    18. `self.generate_json` receives this input and returns different
    19. payloads based on the request args in the first block
    20. TODO: break into one endpoint for each return shape"""
    21. response_type = ChartDataResultFormat.JSON.value
    22. responses: List[Union[ChartDataResultFormat, ChartDataResultType]] = list(
    23. ChartDataResultFormat
    24. )
    25. responses.extend(list(ChartDataResultType))
    26. for response_option in responses:
    27. if request.args.get(response_option) == "true":
    28. response_type = response_option
    29. break
    30. # Verify user has permission to export CSV file
    31. if (
    32. response_type == ChartDataResultFormat.CSV
    33. and not security_manager.can_access("can_csv", "Superset")
    34. ):
    35. return json_error_response(
    36. _("You don't have the rights to ") + _("download as csv"),
    37. status=403,
    38. )
    39. form_data = get_form_data()[0]
    40. try:
    41. datasource_id, datasource_type = get_datasource_info(
    42. datasource_id, datasource_type, form_data
    43. )
    44. force = request.args.get("force") == "true"
    45. # TODO: support CSV, SQL query and other non-JSON types
    46. if (
    47. is_feature_enabled("GLOBAL_ASYNC_QUERIES")
    48. and response_type == ChartDataResultFormat.JSON
    49. ):
    50. # First, look for the chart query results in the cache.
    51. try:
    52. viz_obj = get_viz(
    53. datasource_type=cast(str, datasource_type),
    54. datasource_id=datasource_id,
    55. form_data=form_data,
    56. force_cached=True,
    57. force=force,
    58. )
    59. payload = viz_obj.get_payload()
    60. # If the chart query has already been cached, return it immediately.
    61. if payload is not None:
    62. return self.send_data_payload_response(viz_obj, payload)
    63. except CacheLoadError:
    64. pass
    65. # Otherwise, kick off a background job to run the chart query.
    66. # Clients will either poll or be notified of query completion,
    67. # at which point they will call the /explore_json/data/
    68. # endpoint to retrieve the results.
    69. try:
    70. async_channel_id = async_query_manager.parse_jwt_from_request(
    71. request
    72. )["channel"]
    73. job_metadata = async_query_manager.init_job(
    74. async_channel_id, g.user.get_id()
    75. )
    76. load_explore_json_into_cache.delay(
    77. job_metadata, form_data, response_type, force
    78. )
    79. except AsyncQueryTokenException:
    80. return json_error_response("Not authorized", 401)
    81. return json_success(json.dumps(job_metadata), status=202)
    82. viz_obj = get_viz(
    83. datasource_type=cast(str, datasource_type),
    84. datasource_id=datasource_id,
    85. form_data=form_data,
    86. force=force,
    87. )
    88. return self.generate_json(viz_obj, response_type)
    89. except SupersetException as ex:
    90. return json_error_response(utils.error_msg_from_exception(ex), 400)

    3.1 添加类型

    根据上述代码,我们需要在ChartDataResultFormat里面新增一个excel的类型

    \superset-1.5\superset\common\chart_data.py

    1. class ChartDataResultFormat(str, Enum):
    2. """
    3. Chart data response format
    4. """
    5. CSV = "csv"
    6. JSON = "json"
    7. EXCEL = "excel" // 新增

    3.2 实现generate_json

    \superset-1.5\superset\views\core.py

    找到最后return的地方,可以看到需要实现generate_json,在这个方法新增如下代码

    1. if response_type == ChartDataResultFormat.EXCEL:
    2. return ExcelResponse(
    3. viz_obj.get_excel(), headers=generate_download_headers("xlsx")
    4. )

    3.3 实现ExcelResponse

    \superset-1.5\superset\views\base.py

    1. class ExcelResponse(Response): # pylint: disable=too-many-ancestors
    2. """
    3. Override Response to take into account xlsx encoding from config.py
    4. """
    5. charset = conf.get("EXCEL_EXPORT").get("encoding", "utf-8")

    3.4 实现get_excel

    \superset-1.5\superset\viz.py

    在头部导入

    import io

    在get_csv后面新增如下代码

    1. def get_excel(self) -> bytes:
    2. data = io.BytesIO()
    3. df = self.get_df_payload()["df"]
    4. include_index = not isinstance(df.index, pd.RangeIndex)
    5. df.to_excel(data, index=include_index, **config.get("EXCEL_EXPORT"))
    6. data.seek(0)
    7. return data.read()

    3.5 修改config.py

    \superset-1.5\superset\config.py

    CSV_EXPORT = {"encoding": "utf-8"} 后面新增

    EXCEL_EXPORT = {"encoding": "utf-8"}

    3.6 下载openpyxl

    此时调用会报错,原因是缺少openpyxl包,所以执行

    pip install openpyxl

    最后一个流程就大致完成了,我们可以在这个页面测试下

      下载成功

    四、总结

    这是superset二次开发支持excel的初步探索,目前还有一些问题未解决。如下:

    • Sqllab中点击.xlsx

     点击.xlsx,需要实现对应的/superset/excel/${this.props.query.id}方法。可以参考

    \superset-1.5\superset\views\core.py中的这段实现csv的方法

    1. @has_access
    2. @event_logger.log_this
    3. @expose("/csv/")
    4. def csv( # pylint: disable=no-self-use,too-many-locals
    5. self, client_id: str
    6. ) -> FlaskResponse:
    7. """Download the query results as csv."""
    8. logger.info("Exporting CSV file [%s]", client_id)
    9. query = db.session.query(Query).filter_by(client_id=client_id).one()
    10. try:
    11. query.raise_for_access()
    12. except SupersetSecurityException as ex:
    13. flash(ex.error.message)
    14. return redirect("/")
    15. blob = None
    16. if results_backend and query.results_key:
    17. logger.info("Fetching CSV from results backend [%s]", query.results_key)
    18. blob = results_backend.get(query.results_key)
    19. if blob:
    20. logger.info("Decompressing")
    21. payload = utils.zlib_decompress(
    22. blob, decode=not results_backend_use_msgpack
    23. )
    24. obj = _deserialize_results_payload(
    25. payload, query, cast(bool, results_backend_use_msgpack)
    26. )
    27. columns = [c["name"] for c in obj["columns"]]
    28. df = pd.DataFrame.from_records(obj["data"], columns=columns)
    29. logger.info("Using pandas to convert to CSV")
    30. else:
    31. logger.info("Running a query to turn into CSV")
    32. if query.select_sql:
    33. sql = query.select_sql
    34. limit = None
    35. else:
    36. sql = query.executed_sql
    37. limit = ParsedQuery(sql).limit
    38. if limit is not None and query.limiting_factor in {
    39. LimitingFactor.QUERY,
    40. LimitingFactor.DROPDOWN,
    41. LimitingFactor.QUERY_AND_DROPDOWN,
    42. }:
    43. # remove extra row from `increased_limit`
    44. limit -= 1
    45. df = query.database.get_df(sql, query.schema)[:limit]
    46. csv_data = csv.df_to_escaped_csv(df, index=False, **config["CSV_EXPORT"])
    47. quoted_csv_name = parse.quote(query.name)
    48. response = CsvResponse(
    49. csv_data, headers=generate_download_headers("csv", quoted_csv_name)
    50. )
    51. event_info = {
    52. "event_type": "data_export",
    53. "client_id": client_id,
    54. "row_count": len(df.index),
    55. "database": query.database.name,
    56. "schema": query.schema,
    57. "sql": query.sql,
    58. "exported_format": "csv",
    59. }
    60. event_rep = repr(event_info)
    61. logger.debug(
    62. "CSV exported: %s", event_rep, extra={"superset_event": event_info}
    63. )
    64. return response

    • 导出excel权限问题

    其余的导出excel功能也可以参考这种思路去实现

    参考:

    注:若有错误,欢迎指出

  • 相关阅读:
    【BOOST C++ 15 泛型编程】(1)Boost.TypeTraits
    血氧仪方案组成结构设计分析
    XTU-OJ 1255-勾股数
    手把手教会 VS2022 设计 Winform 高DPI兼容程序 (net461 net6.0 双出)
    1028万!重庆统计局招标大数据平台,开建统计数据中台
    (六)什么是Vite——热更新时vite、webpack做了什么
    Pinia 及其数据持久化 Vue新一代状态管理插件
    java-php-python-ssm抑抑心理交流平台计算机毕业设计
    基于YOLOv8的安全帽检测系统(2):Gold-YOLO,遥遥领先,助力行为检测 | 华为诺亚NeurIPS23
    小孩近视用白炽灯好吗?使用护眼台灯有啥好处?
  • 原文地址:https://blog.csdn.net/qq_16933879/article/details/126034225