Adaptive Importance Sampling with Automatic Model Selection in Value Function Approximation
Abstract:
Off-policy reinforcement learning aims to efficiently reuse data samples gathered in the past. A common approach is to use importance sampling techniques to compensate for the bias caused by the difference between the data-collecting policies and the target policy. However, existing off-policy methods often do not take the variance of value function estimators explicitly into account, and therefore their performance tends to be unstable. To cope with this problem, we propose using an adaptive importance sampling technique which allows us to actively control the trade-off between bias and variance. We further provide a method for optimally determining the trade-off parameter based on statistical machine learning theory.
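The bias-variance trade-off described above can be illustrated with a minimal sketch. A common way to realize such an adaptive scheme (an assumption here, not a detail given in the abstract) is to flatten the ordinary importance weights w = π(a|s)/π_b(a|s) by an exponent ν ∈ [0, 1]: ν = 0 ignores the policy mismatch (low variance, high bias), while ν = 1 recovers ordinary importance sampling (unbiased, high variance). All function names below are illustrative:

```python
import numpy as np

def flattened_weights(pi_target, pi_behavior, nu):
    """Importance weights w = pi_target / pi_behavior, flattened as w**nu.

    nu = 0 gives uniform weights (biased, low variance);
    nu = 1 gives ordinary importance sampling (unbiased, high variance).
    """
    w = pi_target / pi_behavior
    return w ** nu

def weighted_return_estimate(returns, pi_target, pi_behavior, nu):
    """Self-normalized importance-weighted estimate of the expected return
    under the target policy, computed from behavior-policy samples."""
    w = flattened_weights(pi_target, pi_behavior, nu)
    return np.sum(w * returns) / np.sum(w)

# Toy usage: returns collected under a behavior policy, re-weighted
# toward a target policy at two settings of the trade-off parameter.
returns = np.array([1.0, 0.0, 2.0, 1.5])
pi_t = np.array([0.6, 0.1, 0.5, 0.4])   # target-policy action probabilities
pi_b = np.array([0.3, 0.4, 0.25, 0.4])  # behavior-policy action probabilities

est_plain = weighted_return_estimate(returns, pi_t, pi_b, nu=0.0)  # sample mean
est_is = weighted_return_estimate(returns, pi_t, pi_b, nu=1.0)     # ordinary IS
```

In the paper's setting, ν would then be tuned automatically (e.g. by a model-selection criterion) rather than fixed by hand; the estimator above only shows the interpolation that ν controls.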
Year: 2007