Assessing the accuracy and stability of variable selection methods for random forest modeling in ecology
Authors: Eric W. Fox, Ryan A. Hill, Scott G. Leibowitz, Anthony R. Olsen, Darren J. Thornbrugh, Marc H. Weber
Affiliations: 1. U.S. Environmental Protection Agency
2. U.S. Environmental Protection Agency, Oak Ridge Institute for Science and Education (ORISE) Post-doctoral Participant
Journal: Environmental Monitoring and Assessment, 2017, Vol. 189 (7)
Source database: Springer Nature Journal
DOI: 10.1007/s10661-017-6025-0
Keywords: Random forest modeling; Variable selection; Model selection bias; National Rivers and Streams Assessment; StreamCat dataset; Benthic macroinvertebrates
Abstract: Random forest (RF) modeling has emerged as an important statistical learning method in ecology due to its exceptional predictive performance. However, for large and complex ecological data sets, there is limited guidance on variable selection methods for RF modeling. Typically, either a preselected set of predictor variables is used, or stepwise procedures are employed that iteratively remove variables according to their importance measures. This paper investigates the application of variable selection methods to RF models for predicting probable biological stream condition. Our motivating data set consists of the good/poor condition of n = 1365 stream survey sites from the 2008/2009 National Rivers and Streams Assessment, and a large set (p = 212) of landscape features from the...
Journal impact factor: 1.592 (2012)
