MARS


 

MARS – Multivariate Adaptive Regression Splines

Summary

Type of tool: Application

Function: Data mining, species modelling

Online / Desktop: Desktop

Computer infrastructure: Windows, Linux, R

Development status: Commercial and recent freeware

Time of use: As a post process, after data is with the user

Licence

 

MARS - Multivariate Adaptive Regression Splines - is a regression modelling tool, able to separate relevant from irrelevant predictor variables.

 

Description

Figure: A surface plot from Salford Systems MARS. [1]

 

MARS is a hybrid between conventional regression and recursive partitioning methods. It uses piecewise linear basis functions to define the modelled relationship. Basis functions are defined in pairs: a knot marks the point at which the slope changes, and coefficients quantify the slopes of the non-zero sections on either side of it. More than one knot (i.e. more than one pair of basis functions) can be specified for a predictor variable, allowing complex non-linear relationships to be fitted.
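As a minimal sketch of the paired basis functions described above: a knot at t defines the pair max(0, x - t) and max(0, t - x). The function name `hinge_pair`, the knot position and the coefficients below are illustrative assumptions, not taken from any particular MARS implementation.

```python
import numpy as np


def hinge_pair(x, knot):
    """Return the pair of piecewise-linear basis functions for one knot.

    right = max(0, x - knot) is non-zero only above the knot;
    left  = max(0, knot - x) is non-zero only below the knot.
    """
    x = np.asarray(x, dtype=float)
    right = np.maximum(0.0, x - knot)
    left = np.maximum(0.0, knot - x)
    return right, left


# Illustrative values only: one knot at x = 2 with a different slope on each side.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
right, left = hinge_pair(x, knot=2.0)

# Modelled response = intercept + coefficient * each basis function.
y_hat = 1.0 + 0.5 * right - 2.0 * left
print(y_hat)
```

The two coefficients give the slope on each side of the knot, so a single pair can already represent a relationship that changes direction or steepness at one point.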

 

When fitting a MARS model, knots are chosen in a forward stepwise procedure. Candidate knots can be placed at any position within the range of each predictor variable to define a pair of basis functions. At each step, the model selects the knot and its corresponding pair of basis functions that give the greatest decrease in the residual sum of squares. Knot selection proceeds until some maximum model size is reached, after which a backwards-pruning procedure is applied and those basis functions that contribute least to model fit are progressively removed.

At this stage, a predictor variable can be dropped from the model completely if none of its basis functions contribute meaningfully to predictive performance.
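The forward stepwise search can be sketched as follows. This is a simplified illustration under stated assumptions, not the algorithm as implemented in any MARS software: it considers additive terms only (no interactions), places candidate knots at the observed data values, and omits the backwards-pruning stage. The function names and the synthetic data are hypothetical.

```python
import numpy as np


def rss_with_basis(columns, y):
    """Least-squares fit of y on the given basis columns; return the RSS."""
    design = np.column_stack(columns)
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ coef
    return float(residuals @ residuals)


def forward_step(X, y, basis):
    """Try every (variable, knot) candidate and return the best one.

    Candidate knots are placed at the observed values of each predictor.
    Returns (rss, variable_index, knot) for the pair of hinge functions
    that gives the lowest residual sum of squares when added to `basis`.
    """
    best = None
    for j in range(X.shape[1]):
        for knot in np.unique(X[:, j]):
            pair = [np.maximum(0.0, X[:, j] - knot),
                    np.maximum(0.0, knot - X[:, j])]
            rss = rss_with_basis(basis + pair, y)
            if best is None or rss < best[0]:
                best = (rss, j, knot)
    return best


def forward_pass(X, y, max_terms=5):
    """Add hinge pairs greedily until the model would exceed max_terms."""
    basis = [np.ones(len(y))]          # start from the intercept only
    chosen = []
    while len(basis) + 2 <= max_terms:
        rss, j, knot = forward_step(X, y, basis)
        basis += [np.maximum(0.0, X[:, j] - knot),
                  np.maximum(0.0, knot - X[:, j])]
        chosen.append((j, float(knot)))
    return chosen


# Synthetic data: the response depends on column 0 only, with a kink at 0.5.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(200, 2))
y = 3.0 * np.maximum(0.0, X[:, 0] - 0.5) + rng.normal(0.0, 0.01, size=200)

chosen = forward_pass(X, y, max_terms=3)
print(chosen)   # list of selected (variable index, knot) pairs
```

On this synthetic example the first selected pair should use the relevant predictor (column 0) with a knot near 0.5, while the irrelevant predictor (column 1) is not chosen, which illustrates how the search separates relevant from irrelevant variables.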

 

The sequence of models generated from this process is then evaluated using generalized cross-validation (GCV), and the model with the best predictive fit (the lowest GCV score) is selected. [2]
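A hedged sketch of how a generalized cross-validation score might be computed for each model in the sequence: the residual sum of squares is averaged over the observations and inflated by a charge for model complexity. The form of the complexity charge used here (number of basis functions plus a penalty per knot) and the default penalty value are assumptions; implementations differ in the details. The candidate models listed are hypothetical numbers, not results.

```python
def gcv(rss, n_obs, n_terms, n_knots, penalty=3.0):
    """Generalized cross-validation score for one candidate model.

    effective_params = n_terms + penalty * n_knots is an assumed form of
    the complexity charge; implementations differ in the details.
    """
    effective_params = n_terms + penalty * n_knots
    if effective_params >= n_obs:
        return float("inf")
    return (rss / n_obs) / (1.0 - effective_params / n_obs) ** 2


# Hypothetical pruning sequence: (rss, number of terms, number of knots).
candidates = [(40.0, 1, 0), (12.0, 3, 1), (11.5, 5, 2), (11.4, 7, 3)]
scores = [gcv(rss, n_obs=100, n_terms=m, n_knots=k) for rss, m, k in candidates]

# The selected model is the one with the lowest GCV score.
best = min(range(len(scores)), key=scores.__getitem__)
print(best, scores)
```

In this made-up sequence the larger models reduce the residual sum of squares only slightly, so the complexity charge outweighs the gain and the three-term model is preferred.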

 

MARS was introduced by Jerome (Jerry) H. Friedman in 1991. It can be run in the statistics software R (for example, with the earth or mda packages) and is also available as a commercial stand-alone application.

 

Function

  • Analysis tools
  • User interface
    • Personal use
    • Raw data and visual presentation

 

Why use this tool?

MARS is designed to find useful variable transformations and interactions automatically, the kind of complex structure that often hides in high-dimensional data. In doing so, it can uncover patterns and relationships that are difficult to detect with conventional regression approaches. [3]

 

Who will use this tool?

  • Data users
    • Expert
  • Special skills are required

 

How will the tool be used?

  • Data can be imported as spreadsheet, database or SPSS files
  • Windows or Linux
  • Desktop application
  • MARS algorithm can be used within R, or as a commercial stand-alone application
  • User input required

 

Where in the data chain could this tool be used?

  • User’s machine

 

When could this tool be used?

  • As a post process, after data is with the user

 

Availability

Commercial

 

R Project for Statistical Computing

 

Comments

For species’ distribution modelling, Elith et al. (2006) compared the predictive performance of MARS with that of a range of other modelling methods. [4]

 

 


[2] Elith et al. (2006). Novel methods improve prediction of species’ distributions from occurrence data. Ecography 29: 129-151. Online appendix to this paper: E4596, http://www.oikos.ekol.lu.se/app.html

[4] Elith et al. (2006). Novel methods improve prediction of species’ distributions from occurrence data. Ecography 29: 129-151. Available at: http://www.blackwell-synergy.com/toc/eco/29/2
