• [Paper Reading] Dynamic Causal Effects Evaluation in A/B Testing with a Reinforcement Learning Framework


    1. Paper Information

    Paper title: Dynamic Causal Effects Evaluation in A/B Testing with a Reinforcement Learning Framework

    Publication year: 2021

    Journal/Conference: Journal of the American Statistical Association (Chinese Academy of Sciences journal ranking: SCI Tier 1; impact factor: 4.369)

    Paper link: https://www.tandfonline.com/doi/full/10.1080/01621459.2022.2027776

    Authors: Chengchun Shi, Xiaoyu Wang, Shikai Luo, Hongtu Zhu, Jieping Ye, Rui Song

    2. Paper Content

    Abstract

    A/B testing, or online experiment is a standard business strategy to compare a new product with an old one in pharmaceutical, technological, and traditional industries. Major challenges arise in online experiments of two-sided marketplace platforms (e.g., Uber) where there is only one unit that receives a sequence of treatments over time. In those experiments, the treatment at a given time impacts current outcome as well as future outcomes. The aim of this article is to introduce a reinforcement learning framework for carrying A/B testing in these experiments, while characterizing the long-term treatment effects. Our proposed testing procedure allows for sequential monitoring and online updating. It is generally applicable to a variety of treatment designs in different industries. In addition, we systematically investigate the theoretical properties (e.g., size and power) of our testing procedure. Finally, we apply our framework to both simulated data and a real-world data example obtained from a technological company to illustrate its advantage over the current practice. A Python implementation of our test is available at https://github.com/callmespring/CausalRL. Supplementary materials for this article are available online.

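    Summary

    A/B testing, also called online experimentation, is a standard business strategy for comparing a new product with an old one in the pharmaceutical, technology, and traditional industries. The main challenges arise in online experiments on two-sided marketplace platforms (e.g., Uber), where a single unit receives a sequence of treatments over time. In such experiments, the treatment applied at a given time affects both the current outcome and future outcomes. The paper introduces a reinforcement learning framework for conducting A/B tests in these settings while characterizing long-term treatment effects. The proposed testing procedure supports sequential monitoring and online updating, and it applies to a variety of treatment designs across industries. The authors also systematically study the theoretical properties of the procedure (e.g., size and power of the test). Finally, they apply the framework to simulated data and to a real-world dataset from a technology company to illustrate its advantage over current practice.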
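To make the carryover point concrete, here is a minimal toy sketch. It is not the authors' testing procedure and is not taken from their CausalRL code. It assumes a hypothetical one-dimensional system in which the treatment shifts the next state and the outcome equals the current state; the function name `discounted_value` and the parameters `rho`, `b`, `c`, and `gamma` are illustrative assumptions. The treatment has no direct effect on the current outcome here, yet it has a clear long-term effect, which is the kind of effect a comparison of immediate outcomes would miss.

```python
def discounted_value(action, rho=0.5, b=1.0, c=0.0, gamma=0.9, horizon=500):
    """Discounted cumulative outcome when the same action (0 or 1) is applied at every step.

    Hypothetical toy dynamics (assumptions, not from the paper):
      outcome_t = state_t + c * action
      state_{t+1} = rho * state_t + b * action
    """
    state, value = 0.0, 0.0
    for t in range(horizon):
        outcome = state + c * action       # current outcome
        value += (gamma ** t) * outcome    # accumulate the discounted outcome
        state = rho * state + b * action   # treatment carries over into the next state
    return value


# Effect on the first outcome only (a one-step horizon): zero, because c = 0.
immediate_effect = discounted_value(1, horizon=1) - discounted_value(0, horizon=1)

# Effect on the discounted cumulative outcome: positive, driven entirely by carryover.
long_term_effect = discounted_value(1) - discounted_value(0)

print(immediate_effect)             # 0.0
print(round(long_term_effect, 2))   # 16.36
```

For these toy dynamics the long-term effect has the closed form b/(1-rho) * (1/(1-gamma) - 1/(1-gamma*rho)), which is about 16.36 for the default values, so the loop can be checked by hand.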

  • Original post: https://blog.csdn.net/m0_38068876/article/details/126985796