Contributor
Ajay-1

gghexbin: An R package to enable the creation of high quality hexagonally binned graphs that can exploit all of ggplot2’s functionality


Mentors
Thomas Philips, Doug Martin
Organization
The R Project for Statistical Computing

Hexagonal binning makes large datasets legible by preventing overplotting, where overlapping points merge into a solid mass and the information in the plot is lost. This is useful in finance because of the large datasets used in studies (the U.S. stock market has 4,000+ securities), but it is also applicable to other fields. The idea is conceptually simple: the plane is divided into hexagonal bins, and each bin is drawn at a size that scales with the number of points it contains, so a single bin can represent any number of points. This lets the observer clearly see both the density of the data and any additional visualizations drawn on top (regression lines, groupings, etc.). Currently, such plots are generated using the {hexbin} package, which has two major issues.

  1. It is built on top of {lattice}, which is being superseded by {ggplot2}.
  2. Additional visualizations (regression lines, groupings, etc.) are poorly supported or missing altogether, because {hexbin} does not integrate well with other packages.

We will create a new package, {gghexbin}, to replace the existing {hexbin} package. It will integrate well with {ggplot2}, the most powerful and widely used visualization package for R. We expect {gghexbin} to offer improved performance and to be usable in a variety of fields.
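
To make the binning idea concrete, here is a minimal sketch in Python using only the standard library. It is an illustration of the general technique, not the implementation used by {hexbin} or planned for {gghexbin}. The function names `hexbin_counts` and `cell_center` and the `width` parameter are hypothetical, chosen for this example. Each point is assigned to the nearest hexagon center, and the per-bin counts are what a plot would then map to hexagon size or color.

```python
import math
from collections import Counter


def hexbin_counts(points, width=1.0):
    """Count points per hexagonal bin.

    Hexagon centers are the union of two rectangular lattices, the second
    shifted by half a cell in both directions. Assigning each point to the
    nearest center yields regular hexagonal bins.

    Returns a Counter keyed by integer (col, row) indices in "doubled"
    coordinates: even pairs belong to the first lattice, odd pairs to the
    shifted one.
    """
    dx = width                    # horizontal spacing between centers
    dy = width * math.sqrt(3.0)   # vertical spacing within one lattice
    counts = Counter()
    for x, y in points:
        # Nearest center on the unshifted lattice.
        i_a, j_a = round(x / dx), round(y / dy)
        # Nearest center on the lattice shifted by half a cell.
        i_b, j_b = math.floor(x / dx), math.floor(y / dy)
        dist_a = (x - i_a * dx) ** 2 + (y - j_a * dy) ** 2
        dist_b = (x - (i_b + 0.5) * dx) ** 2 + (y - (j_b + 0.5) * dy) ** 2
        if dist_a <= dist_b:
            key = (2 * i_a, 2 * j_a)
        else:
            key = (2 * i_b + 1, 2 * j_b + 1)
        counts[key] += 1
    return counts


def cell_center(key, width=1.0):
    """Convert a (col, row) bin key back to the (x, y) of its center."""
    col, row = key
    return (col * width / 2.0, row * width * math.sqrt(3.0) / 2.0)
```

Calling `hexbin_counts(points, width=0.5)` on a list of `(x, y)` tuples returns one count per occupied bin. However many points there are, the plot only has to draw one hexagon per bin, which is why overplotting does not occur.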