A C library in an R Package
There are good online tutorials for how to get started with the Rcpp package – in particular the documentation for Rcpp.
But what if you want to use the functionality of a C(++) library in an R package?
This simple demonstration package implements a mymean and a mysum function for vectors using the Rcpp package.
The mysum function is what I ultimately want in its own library, while mymean is a function in the R package that uses mysum.
The mysum function is the same throughout this post.
In this post I show how we move from one big cpp file to a mysum library in a separate folder.
If you want an example of including a large C library in an R package, check out the GitHub repo for the haven package.
Create the basic package
Create a minimal package with Rcpp, either in RStudio (File > New Project... > R Package using Rcpp) or with these commands:
devtools::create("mypkg")
usethis::use_rcpp()
I prefer to use the roxygen2 package for documentation.
(If the project is created the point-and-click way, I first delete the NAMESPACE file.)
I therefore add the following to (e.g.) R/utils.R so that the NAMESPACE file is updated correctly when running devtools::document:
#' @useDynLib mypkg, .registration = TRUE
#' @importFrom Rcpp sourceCpp
NULL
With the mymean.cpp that is introduced shortly, the folder hierarchy in the package directory is now:
mypkg
├── DESCRIPTION
├── man
├── mypkg.Rproj
├── NAMESPACE
├── R
│   └── utils.R
└── src
    └── mymean.cpp
To install the package, Hadley Wickham encourages using the Build & Reload button in RStudio’s Build pane.
In the first build two extra files are automatically generated by Rcpp: R/RcppExports.R and src/RcppExports.cpp.
The build part related to the C++ code is (sans the special compiler flags):
g++ ... -c RcppExports.cpp -o RcppExports.o
g++ ... -c mymean.cpp -o mymean.o
g++ ... -o mypkg.so RcppExports.o mymean.o
This reads as follows: mymean.cpp and RcppExports.cpp are each compiled to an object file. The object files are then linked into a shared object file mypkg.so that R can call.
We will see how these compiler commands change during the post.
Only one C++ file
The initial content of src/mymean.cpp is two functions: one for summing the elements of a vector and one for computing the average of the elements in a vector:
#include <stddef.h>
#include <Rcpp.h>
using namespace Rcpp;

// Sum the n doubles starting at X.
double mysum(size_t n, double *X) {
  double s = 0.0;
  for (size_t i = 0; i < n; ++i) {
    s += X[i];
  }
  return s;
}

//' @export
// [[Rcpp::export]]
double mymean(NumericVector x) {
  size_t n = x.size();
  // x.begin() is a pointer to the first element of x.
  double total = mysum(n, x.begin());
  return total / n;
}
There is one small trick here: mysum's second argument X is a pointer to an array of doubles. This is the same as the pointer to the first element of x in mymean, which is available as x.begin().
Using size_t (and therefore also the stddef header) for the size of X is probably overkill for this demo, but it feels more “C like”.
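The pointer trick can be tried without R. The following is a minimal sketch, assuming a std::vector as a stand-in for NumericVector; mymean_vec is a hypothetical name used only for this illustration, and rejecting empty input with an assert is a choice made for the sketch, not something the package does.

```cpp
#include <cassert>
#include <stddef.h>
#include <vector>

// Same summing routine as in the post: n doubles starting at X.
double mysum(size_t n, double *X) {
  double s = 0.0;
  for (size_t i = 0; i < n; ++i) {
    s += X[i];
  }
  return s;
}

// Hypothetical stand-in for mymean that takes a std::vector instead of an
// Rcpp NumericVector, so it compiles without R.
double mymean_vec(std::vector<double> &x) {
  assert(!x.empty());  // assumption for this sketch: reject empty input
  // x.data() points at the first element of the contiguous storage,
  // which plays the role of x.begin() in the Rcpp version.
  return mysum(x.size(), x.data()) / x.size();
}
```

Because mysum only sees a length and a pointer, it does not depend on Rcpp at all, which is what makes it possible to move it into its own library later.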
Include library in separate file
By default, any cpp file in the src folder is compiled when running devtools::install.
You need a header file to make the functions available between files as in any C(++) project, but that is all.
Include the following as src/mysum.cpp:
#include <stddef.h>

// Sum the n doubles starting at X.
double mysum(size_t n, double *X) {
  double s = 0.0;
  for (size_t i = 0; i < n; ++i) {
    s += X[i];
  }
  return s;
}
The header file src/mysum.h declares the mysum function inside an include guard of the same name:
#ifndef MYSUM
#define MYSUM
#include <stddef.h>  // for size_t
double mysum(size_t n, double *X);
#endif
In src/mymean.cpp we replace the mysum function with an include of the header file:
#include <Rcpp.h>
using namespace Rcpp;
#include "mysum.h"

//' @export
// [[Rcpp::export]]
double mymean(NumericVector x) {
  int n = x.size();
  double total = mysum(n, x.begin());
  return total / n;
}
Now mysum.cpp is compiled separately and the object file mysum.o is included in the shared object file:
g++ ... -c RcppExports.cpp -o RcppExports.o
g++ ... -c mymean.cpp -o mymean.o
g++ ... -c mysum.cpp -o mysum.o
g++ ... -o mypkg.so RcppExports.o mymean.o mysum.o
Include library in separate folder
We move on to have mysum in a subfolder of src.
Include library as C++
Now mysum.cpp is moved to the folder src/sum. The header can also be moved to src/sum, but it is not required.
A Makefile is now needed that tells R which files to compile, what the object files are called and what paths to include. The file is called Makevars on *nix and Makevars.win on Windows and lives in the src folder:
CPPFILES = $(wildcard *.cpp sum/*.cpp)
SOURCES = $(CPPFILES)
OBJECTS = $(CPPFILES:.cpp=.o)
PKG_CXXFLAGS = -Isum
The CPPFILES are all the cpp files in src and src/sum.
The OBJECTS files have the same base name as the CPPFILES, but their file extension is o instead of cpp.
Finally, if the header file mysum.h is moved to src/sum, this directory must be added to the compiler’s list of include directories, which is what -Isum does.
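For readers less used to make syntax, the substitution $(CPPFILES:.cpp=.o) replaces a trailing .cpp with .o in every word of the list. The same mapping can be sketched in C++; replace_suffix and objects_for are hypothetical helper names made up for this illustration and are not part of R, Rcpp or make.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Replace `from` with `to` when `word` ends in `from`; otherwise return the
// word unchanged. This mirrors what $(VAR:from=to) does to one word.
std::string replace_suffix(const std::string &word, const std::string &from,
                           const std::string &to) {
  assert(!from.empty());  // assumption for this sketch: suffix is non-empty
  if (word.size() >= from.size() &&
      word.compare(word.size() - from.size(), from.size(), from) == 0) {
    return word.substr(0, word.size() - from.size()) + to;
  }
  return word;
}

// Map a list of source paths to object paths, like OBJECTS in the Makevars.
std::vector<std::string> objects_for(const std::vector<std::string> &sources) {
  std::vector<std::string> objects;
  for (const std::string &src : sources) {
    objects.push_back(replace_suffix(src, ".cpp", ".o"));
  }
  return objects;
}
```

Note that the directory part of each path is kept, which is why the object file for sum/mysum.cpp ends up as sum/mysum.o in the compiler commands.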
The only difference in the compiler commands is the change of location for the mysum files:
g++ ... -c RcppExports.cpp -o RcppExports.o
g++ ... -c mymean.cpp -o mymean.o
g++ ... -c sum/mysum.cpp -o sum/mysum.o
g++ ... -o mypkg.so mymean.o RcppExports.o sum/mysum.o
Include library as C
In src/Makevars we now have a list of C++ files and a list of C files.
Together they make up the SOURCES files:
CFILES = $(wildcard sum/*.c)
CPPFILES = $(wildcard *.cpp)
SOURCES = $(CFILES) $(CPPFILES)
OBJECTS = $(CFILES:.c=.o) $(CPPFILES:.cpp=.o)
PKG_CXXFLAGS = -Isum
Calling a C library from C++ code requires a few special lines in the header file, src/sum/mysum.h, so that the C++ compiler looks for mysum under its unmangled C name:
#ifndef MYSUM
#define MYSUM
#include <stddef.h>  /* for size_t */
#ifdef __cplusplus
extern "C" {
#endif
double mysum(size_t n, double *X);
#ifdef __cplusplus
}
#endif
#endif
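The effect of the guarded extern "C" block can be sketched in a single C++ file. This is only an illustration under stated assumptions: in the package the definition of mysum lives in sum/mysum.c and is compiled by the C compiler, whereas here it sits in the same file, and mean_of is a hypothetical caller that stands in for mymean without needing Rcpp.

```cpp
#include <cassert>
#include <stddef.h>

// What the guarded header looks like to a C++ compiler: the declaration
// gets C linkage, so the name mysum is not mangled.
#ifdef __cplusplus
extern "C" {
#endif
double mysum(size_t n, double *X);
#ifdef __cplusplus
}
#endif

// Stand-in for sum/mysum.c. The earlier declaration already fixed the
// linkage, so this definition has C linkage as well.
double mysum(size_t n, double *X) {
  double s = 0.0;
  for (size_t i = 0; i < n; ++i) {
    s += X[i];
  }
  return s;
}

// Hypothetical C++ caller playing the role of mymean.
double mean_of(size_t n, double *X) {
  assert(n > 0);  // assumption for this sketch: reject empty input
  return mysum(n, X) / n;
}
```

A C compiler never defines __cplusplus, so it skips the extern "C" lines and sees an ordinary C prototype, which is why the same header can be shared by both languages.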
In the compiler commands the C compiler gcc is now used for mysum.c instead of the C++ compiler:
gcc ... -c sum/mysum.c -o sum/mysum.o
g++ ... -c mymean.cpp -o mymean.o
g++ ... -c RcppExports.cpp -o RcppExports.o
g++ ... -o mypkg.so sum/mysum.o mymean.o RcppExports.o
The final file structure in mypkg:
mypkg
├── DESCRIPTION
├── man
├── mypkg.Rproj
├── NAMESPACE
├── R
│   ├── RcppExports.R
│   └── utils.R
└── src
    ├── Makevars
    ├── mymean.cpp
    ├── RcppExports.cpp
    └── sum
        ├── mysum.c
        └── mysum.h