R project for statistical computing

R is a free software environment for statistical computing and graphics

Technologies
c, javascript, c++, r-project, fortran
Topics
visualization, machine learning, data science, graphics, statistics

R is an integrated suite of software facilities for data manipulation, calculation and graphical display. It includes

  • an effective data handling and storage facility,
  • a suite of operators for calculations on arrays, in particular matrices,
  • a large, coherent, integrated collection of intermediate tools for data analysis,
  • graphical facilities for data analysis and display either on-screen or on hardcopy, and
  • a well-developed, simple and effective programming language which includes conditionals, loops, user-defined recursive functions and input and output facilities.
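
A minimal sketch of the array operators and language features listed above. The names m, factorial_rec and total are made up for illustration and are not part of R itself.

```r
# A 2 x 2 matrix, filled column by column.
m <- matrix(c(1, 2, 3, 4), nrow = 2)

# Operators act on whole arrays: %*% is the matrix product, * is elementwise.
m_squared <- m %*% m
m_doubled <- m * 2

# A user-defined recursive function built around a conditional.
factorial_rec <- function(n) {
  if (n <= 1) {
    return(1)
  }
  n * factorial_rec(n - 1)
}

# A loop that accumulates the factorials of 1 through 5.
total <- 0
for (i in 1:5) {
  total <- total + factorial_rec(i)
}
```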

The term “environment” is intended to characterize it as a fully planned and coherent system, rather than an incremental accretion of very specific and inflexible tools, as is frequently the case with other data analysis software.

R, like S, is designed around a true computer language, and it allows users to add additional functionality by defining new functions. Much of the system is itself written in the R dialect of S, which makes it easy for users to follow the algorithmic choices made. For computationally-intensive tasks, C, C++ and Fortran code can be linked and called at run time. Advanced users can write C code to manipulate R objects directly.

Many users think of R as a statistics system. We prefer to think of it as an environment within which statistical techniques are implemented. R can be extended (easily) via packages. There are about eight packages supplied with the R distribution and many more are available through the CRAN family of Internet sites covering a very wide range of modern statistics.
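
As a rough illustration of the package mechanism, the sketch below attaches a package that ships with R and shows how a contributed package would typically be fetched from CRAN. The package name markovchain is used only as an example, and the install line is commented out because it needs network access; has_stats is a made-up variable name.

```r
# Packages supplied with the R distribution can be attached directly.
library(stats)

# Contributed packages are usually installed from CRAN first.
# This needs network access, so it is left commented out here.
# install.packages("markovchain")

# Check that a package is available without attaching it.
has_stats <- requireNamespace("stats", quietly = TRUE)
```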

R has its own LaTeX-like documentation format, which is used to supply comprehensive documentation, both on-line in a number of formats and in hardcopy.

2017 Program

Successful Projects

Contributor
Balázs Dukai
Mentor
david.bucklin@gmail.com, Clement Calenge, Mathieu Basille
Organization
R project for statistical computing
Interactive trajectory tool for rpostgisLT
The goal of the project is to build an interactive trajectory analysis extension for the R package rpostgisLT, that was developed during the GSoC...
Contributor
Natalia da Silva
Mentor
Annette O'Connor, Heike Hofmann
Organization
R project for statistical computing
metawRite:Meta analysis update package, LSR (Living systematic review)
Living systematic reviews have been proposed as a new approach to deal with the main problem of traditional systematic reviews. Systematic reviews...
Contributor
Vandit Jain
Mentor
Sai Bhargav Yalamanchi, Giorgio Spedicato
Organization
R project for statistical computing
Markovchain package
This project aims to extend the current functionality and capabilities of the R package ‘markovchain’ in order to provide statisticians a more...
Contributor
Xia Zhang
Mentor
Genevera Allen, Michael Weylandt
Organization
R project for statistical computing
Graphical Models for Mixed Multi Modal Data
In this project, we propose a new package to make graphical models for mixed multi-modal data readily available to a wide audience. The proposed...
Contributor
Lorenz Walthert
Mentor
Yihui Xie, Kirill Müller
Organization
R project for statistical computing
Noninvasive source code formatting
A coherent coding style greatly simplifies collaborative work. This is most easily enforced by an automatic code formatter, but existing solutions to...
Contributor
mb706
Mentor
Lars Kotthoff, Bernd Bischl
Organization
R project for statistical computing
Operator Based Machine Learning Pipeline Construction
The package mlr is a comprehensive machine learning toolkit for R, providing a standardized interface to over sixty machine learning R packages, in...
Contributor
Wazeer Zulfikar
Mentor
Krishna Sankar, Tomasz Melcer
Organization
R project for statistical computing
Native R API for Tensorflow
R does not have a high-level modeling language for designing neural networks. Tensorflow, an open source python library, is a great tool for this...
Contributor
Coin Lewis-Beck
Mentor
Daniel Turek, Perry de Valpine
Organization
R project for statistical computing
NIMBLE Ecology Package
The goal of this project is to build a new R package providing high-level user interfaces to many kinds of ecological models and implementing the...
Contributor
Shubham-Chaturvedi
Mentor
Nathalie Villa-Vialaneix, Pierre Neuvial
Organization
R project for statistical computing
Constrained Hierarchical Agglomerative Clustering
Constrained HAC is useful in various application fields like ecology and bioinformatics. This project aims to build an efficient constrained HAC...
Contributor
Faizan Khan
Mentor
Toby Dylan Hocking, cpsievert
Organization
R project for statistical computing
Animated Interactive Plots (animint)
The animint package in R allows animated data visualization, which is a useful tool for obtaining an intuitive understanding of patterns in multivariate...
Contributor
BEN UBAH
Mentor
Marijan Kostrun, Hans W Borchers
Organization
R project for statistical computing
Control Systems Toolbox
This project proposes to develop a control-systems package for R. For many years, R has been used extensively for several data-related tasks. With...
Contributor
Qingyue Xu
Mentor
Narayani Barve, rohitmg, Vijay Barve, Thomas Vattakaven
Organization
R project for statistical computing
Parser and Crawler for Biodiversity checklists
Compiling taxonomic checklists from varied sources of data is a common task that biodiversity informaticians encounter. Data for checklists usually...
Contributor
Ashwin Agrawal
Mentor
Vijay Barve, Tom-Gu
Organization
R project for statistical computing
Biodiversity Data Cleaning
Data cleaning is a process used to determine inaccurate, incomplete, or unreasonable data and then improving the quality through correction of...
Contributor
Thiloshon Nagarajah
Mentor
Vijay Barve, Tom-Gu
Organization
R project for statistical computing
Integrating biodiversity data curation functionality
The importance of data in biodiversity research has been repeatedly stressed in recent times, and various organizations have come together and...
Contributor
Samuel Borms
Mentor
kboudt, Keven Bluteau, ArdiaD
Organization
R project for statistical computing
Sentometrics: An integrated framework for text based multivariate time series modeling and forecasting
This project leads to the creation of the Sentometrics package that is designed to do time series analysis based on textual sentiment. Time series...
Contributor
Alexandre Almeida
Mentor
Adam Loy, Heike Hofmann
Organization
R project for statistical computing
Distributional Assessments with Q-Q Plots
Quantile-quantile plots (Q-Q plots) are a powerful way of visually diagnosing distributional assumptions of random variables. Q-Q plots have been...
Contributor
Xin Chen
Mentor
Sasha Aravkin, Daniel Hanson, Peter Carl, Doug Martin
Organization
R project for statistical computing
Risk and Performance Measure Standard Errors for Serially Correlated Returns
This project is focused on developing a Risk/Performance Standard Errors (RPSE) package that implements a new methodology based on statistical...
Contributor
Chindhanai Uthaisaad
Mentor
Brian Peterson, Kjell Konis, Thomas Philips, Peter Carl, Doug Martin
Organization
R project for statistical computing
Advancing_factorAnalytics
The project plan represents a very significant step forward for the factorAnalytics package by adding advanced methods to the fundamental factor...
Contributor
Jialin Ma
Mentor
Miguel Pignatelli, Toby Dylan Hocking
Organization
R project for statistical computing
Interactive Genome Browser in R
The project intends to provide an interactive and user-friendly way to visualize track-based genomic data by wrapping the flexible TnT javascript...
Contributor
Jason Ge
Mentor
Xingguo, Tuo Zhao
Organization
R project for statistical computing
Active Set Based Second-order Algorithm for Sparse Learning
For sparse learning problems, such as sparse generalized linear models and sparse undirected graphical model estimation, the current R packages still...
Contributor
Rover Van
Mentor
Toby Dylan Hocking, Anuj Khare
Organization
R project for statistical computing
Speed optimizations for iregnet
The iregnet package is the first R package to support four types of censoring and elastic net (L1 + L2) regularization. Though it is already useful...
Contributor
Leah South
Mentor
Adam Johansen, Dirk Eddelbuettel
Organization
R project for statistical computing
Efficient SMC Algorithms in Rcpp
Sequential Monte Carlo (SMC) methods are powerful alternatives to standard Markov chain Monte Carlo (MCMC) for sampling from the posterior of complex...
Contributor
lwei
Mentor
Brian Peterson, Diego Klabjan, Matthew Dixon
Organization
R project for statistical computing
Integrated Oversampling for Time Series Classification
A significant number of learning problems involve the accurate classification of rare events or outliers from time series data. For example, the...
Contributor
Luis Antonio Damiano
Mentor
Brian Peterson, Michael Weylandt
Organization
R project for statistical computing
Bayesian Hierarchical Hidden Markov Models applied to financial time series.
The goal of this project is to replicate research in Hierarchical Hidden Markov Models (HHMM) applied to financial data. This model is a...
Contributor
Lindsay Rutter
Mentor
Roxane Legaie, Di Cook
Organization
R project for statistical computing
bigPint: Big multivariate data plotted interactively
Parallel coordinate plots, scatterplot matrices, and replicate line plots are useful visual tools to understand the relationship between variables in...
Contributor
Binxiang Ni
Mentor
Dirk Eddelbuettel, Qiang Kou
Organization
R project for statistical computing
Sparse matrix automatic conversion in RcppArmadillo
This project aims to complete the integration between the R Matrix package and the Armadillo package. During this project, I am going to do the following: ...
Contributor
Robin Kohze
Mentor
Lindolfo Pedraza, ciroyo
Organization
R project for statistical computing
FireData: Connecting R to Firebase
R is one of the strongest players in data science. The aim is to connect its strength in data analysis with the actual data. By making it easier to...
Contributor
Pushpak Sarkar
Mentor
Kjell Konis, Yindeng Jiang, Peter Carl, Doug Martin
Organization
R project for statistical computing
Portfolio Construction and Risk Management with Unequal Returns Histories
The goal of this project is to implement the three methods in an “Unequal Histories” package that: (1) facilitates use of the methods in portfolio...
Contributor
Zhehui Chen
Mentor
Xingguo, Tuo Zhao
Organization
R project for statistical computing
A stochastic variational inference framework for probabilistic modeling toolbox in R
Stochastic variational inference is a powerful tool for analyzing probabilistic models, especially for large-scale problems. In this project, our goal...
Contributor
Earo Wang
Mentor
Di Cook, Rob J Hyndman
Organization
R project for statistical computing
Tidy data structures and visual methods to support exploration of big temporal-context data
This new package aims to fit into the tidyverse and grammar of graphics suite to support and facilitate temporal-context data analysis and...
Contributor
Matthew Piekenbrock
Mentor
Mikhail Belkin, Michael Hahsler
Organization
R project for statistical computing
Estimating the Empirical Cluster Tree
The aim of this project is to provide a standalone, scalable, and extensible R package that unifies existing methodologies for estimating the...
Contributor
cdries
Mentor
Brian Peterson, kboudt, Peter Carl
Organization
R project for statistical computing
Improved functionality for higher order comoment estimation in PerformanceAnalytics
In this project I aim to improve estimation of the higher order comoment matrices currently implemented in the R packages PerformanceAnalytics and...
Contributor
Leopoldo Catania
Mentor
Brian Peterson, kboudt, ArdiaD
Organization
R project for statistical computing
Markov Switching GARCH models (MSGARCH) in R
Modeling the volatility of financial markets is central in risk management. A seminal contribution in this field was the development of the GARCH...