JuliaConnectoR

This R-package provides a functionally oriented interface between R and Julia. The goal is to call functions from Julia packages directly as R functions. Julia functions imported via the JuliaConnectoR can accept and return R variables. It is also possible to pass R functions as arguments in place of Julia functions, which allows callbacks from Julia to R.

From a technical perspective, R data structures are serialized with an optimized custom streaming format, sent to a (local) Julia TCP server, and translated to Julia data structures by Julia. The results of function calls are likewise translated back to R. Complex Julia structures can either be used by reference via proxy objects in R or fully translated to R data structures.

More details about the package together with a comparison with competing solutions can be found in the article:

Lenz, S., Hackenberg, M., & Binder, H. (2022). The JuliaConnectoR: a functionally oriented interface for integrating Julia in R. Journal of Statistical Software, 101(6), 1–24. https://doi.org/10.18637/jss.v101.i06

Installation

The package can be installed in R (version ≥ 3.2) via CRAN:

install.packages("JuliaConnectoR")

Or the development version can be installed with:

devtools::install_github("stefan-m-lenz/JuliaConnectoR")

The package requires that Julia (version ≥ 1.0) is installed and that the Julia executable is in the system search PATH or that the JULIA_BINDIR environment variable is set to the bin directory of the Julia installation.

After you have installed Julia with juliaup, execute the command julia on the command line. This will do the actual installation of Julia and start Julia for the first time. On Linux and Windows, the JuliaConnectoR package should now be able to use the Julia installation, as the link to the Julia installation is on the PATH. On Mac, Julia might not be on the PATH when using e.g. RStudio. In this case, ensure that Julia is on the PATH in R, or set the JULIA_BINDIR variable, as described below.

The specific Julia version can optionally be specified by using the JULIA_BINDIR variable. This will take precedence if specified. When using juliaup, note that the JULIA_BINDIR variable must point to the actual installation directory of Julia. This is different from the directory that is returned when executing which julia on the command line, which is the directory that contains the link to the default Julia executable.

To get the proper value of the JULIA_BINDIR variable, execute Sys.BINDIR on the Julia command line. Then set the environment variable in R via Sys.setenv("JULIA_BINDIR" = "/your/path/to/Julia/bin") before starting Julia and the JuliaConnectoR should be able to discover the specified Julia installation.

Overview

The following table lists the most important functions exported by the package:

Function name	Description
`juliaImport`	Load a Julia package in Julia via `import` and return its functions and data types as an environment, such that the functions can be called directly in R
`juliaFun`	Create an R function that wraps a Julia function
`juliaCall`	Call any Julia function by name. Not needed for functions attached via `juliaImport` or created via `juliaFun`.
`juliaEval`	Evaluate a simple Julia expression (and return the result)
`juliaLet`	Evaluate Julia expressions with R variables in place of Julia variables employing a `let` block (and return the result)
`juliaGet`	Fully translate a Julia object to an R object
`juliaExpr`	Use a Julia expressions or refer to a Julia object via a string in R

For a detailed description of the functions with some examples, and for more details about the translation of data structures, please see the latest documentation.

Examples

Using Flux

With the JuliaConnectoR it is possible to use, e. g., the Flux Julia package for training a neural network on the famous iris data set.

You can see that the translation from Julia to R code is rather straightforward:

Below we see example code for training a neural network for classification on the famous iris data set in Julia and its translation in R. Both the Julia version of the script and the R version are two complete runnable examples, showing all important steps in training a neural network.

Julia script: Data preparation

using Pkg
Pkg.add(PackageSpec(name = "RDatasets", version = "0.7"))
Pkg.add(PackageSpec(name = "Flux", version = "0.14"))

# Import packages and set a seed
import Flux
using Random
Random.seed!(1);

# Load data and split it into training and test data
function prepare_data(x, labels)
   nsamples = size(x, 2)
   x = Matrix{Float32}(x)
   testidxs = randperm(nsamples)[1:(round(Int, nsamples*0.3))]
   trainidxs = setdiff(1:nsamples, testidxs)
   x_train = x[:, trainidxs]
   x_test = x[:, testidxs]
   y = Flux.onehotbatch(labels, unique(labels))
   y_train = y[:, trainidxs]
   y_test = y[:, testidxs]
   (training = (x = x_train, y = y_train),
         test = (x = x_test, y = y_test))
end

using RDatasets
import Tables
iris = dataset("datasets", "iris")
x = Tables.matrix(iris[:, 1:4])'
data = prepare_data(x, iris[:, :Species])
trainingdata = data.training
testdata = data.test

Julia script: model training and evaluation

# Load necessary features
import Flux
using Statistics

# Train the Flux model
model = Flux.Chain(
      Flux.Dense(4, 4, Flux.relu),
      Flux.Dense(4, 4, Flux.relu),
      Flux.Dense(4, 3),
      Flux.softmax)

loss(model, x, y) = Flux.crossentropy(model(x), y)
loss(model, data::NamedTuple) = loss(model, data.x, data.y)

function train_network!(model, x, y; epochs, callback)
   opt = Flux.setup(Flux.Adam(), model);
   for i in 1:epochs
      Flux.train!(loss, model, [(x, y)], opt)
      callback(i)
   end
end

epochs = 2500
train_losses = Vector{Float32}(undef, epochs);
test_losses = Vector{Float32}(undef, epochs);
train_network!(model, trainingdata.x, trainingdata.y, epochs = epochs,
      callback = (i) -> begin
         train_losses[i] = loss(model, trainingdata)
         test_losses[i] = loss(model, testdata)
      end)

# (Could plot losses here)

# Evaluate model
accuracy(model, data) =
      mean(Flux.onecold(model(data.x)) .== Flux.onecold(data.y))
accuracy(model, trainingdata)
accuracy(model, testdata)

Adapted R/JuliaConnectoR script: data preparation

library(JuliaConnectoR)

juliaEval('using Pkg; Pkg.add(PackageSpec(name = "Flux", version = "0.14"))')

# The Julia code can simply be reused
prepare_data <- juliaEval('
      import Flux
      using Random
      Random.seed!(1);

      function prepare_data(x, labels)
         nsamples = size(x, 2)
         x = Matrix{Float32}(x)
         testidxs = randperm(nsamples)[1:(round(Int, nsamples*0.3))]
         trainidxs = setdiff(1:nsamples, testidxs)
         x_train = x[:, trainidxs]
         x_test = x[:, testidxs]
         y = Flux.onehotbatch(labels, unique(labels))
         y_train = y[:, trainidxs]
         y_test = y[:, testidxs]
         (training = (x = x_train, y = y_train),
               test = (x = x_test, y = y_test))
      end')

x <- as.matrix(iris[, 1:4])
labels <- iris[, "Species"]
data <- prepare_data(t(x), labels)
trainingdata <- data$training
testdata <- data$test

Adapter R/JuliaConnectoR script: model training and evaluation

library(JuliaConnectoR)

# load Flux features available in R
Flux <- juliaImport("Flux")

juliaEval("using Statistics") # for Julia code only

juliaEval("import Random; Random.seed!(11);")
model <- Flux$Chain(
      Flux$Dense(4L, 4L, Flux$relu),
      Flux$Dense(4L, 4L, Flux$relu),
      Flux$Dense(4L, 3L),
      Flux$softmax)

loss <- juliaEval('loss(model, x, y) = Flux.crossentropy(model(x), y)
                  loss(model, data::NamedTuple) = loss(model, data.x, data.y)')

train_network <- juliaEval('
   function train_network!(model, x, y; epochs, callback)
      opt = Flux.setup(Flux.Adam(), model);
      for i in 1:epochs
         Flux.train!(loss, model, [(x, y)], opt)
         callback(i)
      end
   end')

plotLoss <- function(epoch, loss_train, loss_test, epochs) {
   if (epoch == 1) {
      ymax <- max(loss_train, loss_test)
      plot(x = c(1, 1), y = c(loss_train, loss_test),
           xlim = c(0, epochs), ylim = c(0, ymax*1.1),
           col = c("red", "blue"), xlab = "Epoch", ylab = "Loss")
      legend("topright", legend = c("Training data", "Test data"),
             col = c("red", "blue"), pch = 1)
   } else {
      points(x = c(epoch, epoch), y = c(loss_train, loss_test),
             col = c("red", "blue"))
   }
}

epochs <- 2500
train_losses <- rep(0, epochs)
test_losses <- rep(0, epochs)
train_network(model, trainingdata$x, trainingdata$y, epochs = epochs,
      callback = function(i) {
         train_losses[i] <<- loss(model, trainingdata)
         test_losses[i] <<- loss(model, testdata)
         plotLoss(i, train_losses[i], test_losses[i], epochs)
      })

accuracy <- juliaEval("accuracy(model, data) =
      mean(Flux.onecold(model(data.x)) .== Flux.onecold(data.y))")
accuracy(model, trainingdata)
accuracy(model, testdata)

Using the callback makes it possible to plot the training curve during the training. This example runs so fast that it is not really possible to watch the training progress. In our case it would be better to optimize the training function to accumulate the losses in Julia and plot them in the end of the training. However, the example demonstrates the principle of how callback functions can be used to watch the progress during the training.

Using BoltzmannMachines in R

The following example code shows how the JuliaConnectoR can be used with the Julia package BoltzmannMachines in R.

library(JuliaConnectoR)

# Test BoltzmannMachines package
# If not installed, install the current version via
# juliaEval('using Pkg; Pkg.add("BoltzmannMachines"))')

# Set a random seed in Julia
juliaEval("using Random; Random.seed!(5);")

BM <- juliaImport("BoltzmannMachines")

# a test data set from the BoltzmannMachines-package, just to have some data
x <- BM$barsandstripes(100L, 4L)
x

# Train DBMs with
dbm <- BM$fitdbm(x, epochs = 40L, learningrate = 0.05,
                 nhiddens = c(4L,3L))
dbm
dbm2 <- BM$fitdbm(x, epochs = 10L,
                  pretraining = list(BM$TrainLayer(nhidden = 4L),
                                     BM$TrainLayer(nhidden = 3L)))
dbm2

# Use a trained model to generate samples
BM$samples(dbm, 10L)

# Evaluate the model: Likelihood estimation ...
BM$loglikelihood(dbm2, x)
#  ... or exact calculation (possible for such a small model)
BM$exactloglikelihood(dbm2, x)

# RBM-fitting with simple monitoring, e. g. just print the progress in R
rbm <- BM$fitrbm(x, epochs = 20L,
                 monitoring = function(rbm, epoch) {print(epoch)})


# Now real monitoring with callback functions
# (Abusing environments for call-by-reference value collection)
monitor <- new.env(parent = emptyenv())
monitor$loglik <- c()
rbm <- BM$fitrbm(x, epochs = 100L,
                 monitoring = function(rbm, epoch) {
                    monitor$loglik <- c(monitor$loglik, BM$loglikelihood(rbm, x))
                 })
plot(1:100, monitor$loglik, "l")

juliaEval("Random.seed!(5);")

# A complex dbm example with layerwise monitoring
monitor <- new.env(parent = emptyenv())
dbm <- BM$fitdbm(x, epochs = 60L,
                 learningrate = 0.05,
                 learningratepretraining = 0.01,
                 pretraining = list(
                       BM$TrainLayer(nhidden = 4L, epochs = 70L,
                              monitoring = function(rbm, epoch) {
                                 monitor$layer1 <- c(monitor$layer1,
                                                     BM$reconstructionerror(rbm, x))
                              }),
                        BM$TrainLayer(nhidden = 3L, epochs = 50L,
                              monitoring = function(rbm, epoch) {
                                 monitor$layer2 <- c(monitor$layer2,
                                                     BM$reconstructionerror(rbm, x))
                             })),
               monitoring = function(dbm, epoch) {
                  monitor$logproblowerbound <- c(monitor$logproblowerbound,
                                                 BM$exactloglikelihood(dbm, x))
               }
)
plot(1:70, monitor$layer1, "l")
plot(1:50, monitor$layer2, "l")
plot(1:60, monitor$logproblowerbound, "l")


# First approach for Gibbs-Sampling, allows access to hidden nodes
particles <- BM$initparticles(dbm2, 20L)
particles <- BM$`gibbssample!`(particles, dbm2, 100L)
particles

# Second approach for Gibbs sampling: All-in-one, returning only visible nodes
BM$samples(dbm, 5L)

# Conditional Gibbs sampling
BM$samples(dbm, 5L, conditions = juliaEval("[1 => 1.0, 2 => 0.0]"))


# A Gaussian-BernoulliRBM
rbm <- BM$fitrbm(data.matrix(iris[, 1:4]), rbmtype = BM$GaussianBernoulliRBM)
BM$samples(rbm, 10L)

For more abstract examples, see the tests.

Name		Name	Last commit message	Last commit date
Latest commit History 511 Commits
.github/workflows		.github/workflows
R		R
inst		inst
man		man
tests		tests
.Rbuildignore		.Rbuildignore
.gitignore		.gitignore
DESCRIPTION		DESCRIPTION
DEVDOC.md		DEVDOC.md
JuliaConnectoR-Logo.svg		JuliaConnectoR-Logo.svg
JuliaConnectoR.Rproj		JuliaConnectoR.Rproj
LICENCE		LICENCE
LICENSE		LICENSE
NAMESPACE		NAMESPACE
README.md		README.md
extract-README-examples.R		extract-README-examples.R
make-manual.R		make-manual.R

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

JuliaConnectoR

Installation

Overview

Examples

Using Flux

Using BoltzmannMachines in R

About

Releases

Packages

Languages

License

yjunechoe/JuliaConnectoR

Folders and files

Latest commit

History

Repository files navigation

JuliaConnectoR

Installation

Overview

Examples

Using Flux

Using BoltzmannMachines in R

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages