Parallel Computing in R

Discussion in 'General Questions' started by MelvinaN, Dec 12, 2020.

  1. MelvinaN

    I have a question about parallel computing in R. I want to send each of four bootstrap standard error computations (one per response: Total_lignin, Glucose, Xylose, and arabinose) to a different core, and report the estimate of each maximum with its corresponding standard error. The predictor is concentration.

    I have already written a function called seBootFun that takes resp, pred, B, and data and returns the mean and standard deviation of the bootstrapped estimates. It works when I call it directly with Glucose as the response, but it fails when I try to run it in parallel. Could you please help me with it? The error is shown.

    # Load the libraries
    library(dplyr)
    library(tidyverse)

    # Read the .csv and only use M.giganteus and S.ravennae.
    dat <- read_csv('concentration.csv') %>%
      filter(variety == 'M.giganteus' | variety == 'S.ravennae') %>%
      arrange(variety)

    # Set seed for reproducibility
    set.seed(12)

    # Sample size
    n <- nrow(dat)

    # Return the bootstrap mean and sd of the estimated maximum
    seBootFun <- function(resp, pred, B, data){
      # Fit the quadratic model to one bootstrap resample and return the
      # predictor value at which the fitted curve peaks.
      # (Named bootMax so that it does not mask base::max.)
      bootMax <- function(resp, pred, data){
        # Resample the rows of `data` with replacement, keeping the same size.
        # Use the `data` argument rather than the global `dat`.
        boot <- dplyr::slice_sample(data, prop = 1, replace = TRUE)

        # A quadratic model fit
        y <- boot[[resp]]
        x <- boot[[pred]]
        fit <- lm(y ~ x + I(x^2))

        # Vertex of the fitted parabola: -b1 / (2 * b2)
        unname(-coef(fit)[2] / (2 * coef(fit)[3]))
      }

      maxs <- replicate(B, bootMax(resp, pred, data = data))

      return(c(mean(maxs), sd(maxs)))
    }

    # Output

    result <- seBootFun(resp = 'Glucose', pred = 'concentration', B = 5000, data = dat)
    names(result) <- c('estimated', 'sd')
    result

    # Load the `parallel` library
    library(parallel)

    # Set seed and iterated times
    set.seed(8)

    # Set up cores
    cores <- detectCores()
    cluster <- makeCluster(cores - 1)

    clusterExport(cluster, list("seBootFun", "max", "maxs"))
    clusterEvalQ(cluster, library(tidyverse))
    clusterEvalQ(cluster, library(dplyr))

    result.1 <-
    parLapply(cluster, fun = seBootFun(resp = 'Total_lignin', pred = 'concentration', B = 5000, data = dat))
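    For comparison, here is a minimal sketch of how the parallel step could be structured. It is not a verified fix for the exact error above, only one arrangement that follows how parLapply expects its arguments: a vector to iterate over (here the response names) and a function, rather than the result of calling seBootFun. Several things are assumptions made so the sketch runs on its own: `dat` is simulated because concentration.csv is not shown, the resampling uses base R indexing in place of slice_sample, B is reduced to 1000 to keep it quick, and the names `responses`, `nWorkers`, and `results` are hypothetical. Note also that set.seed() in the main session does not seed the workers, so the sketch uses clusterSetRNGStream() instead.

```r
library(parallel)

# Hypothetical stand-in for the filtered `dat`; the real file is not shown
set.seed(12)
conc <- rep(0:10, each = 4)
dat <- data.frame(
  concentration = conc,
  Total_lignin  = 5 + 2.0 * conc - 0.20 * conc^2 + rnorm(length(conc), sd = 0.5),
  Glucose       = 3 + 1.8 * conc - 0.15 * conc^2 + rnorm(length(conc), sd = 0.5),
  Xylose        = 2 + 1.2 * conc - 0.12 * conc^2 + rnorm(length(conc), sd = 0.5),
  arabinose     = 1 + 0.8 * conc - 0.10 * conc^2 + rnorm(length(conc), sd = 0.5)
)

# Base-R variant of seBootFun (assumption: swapped in for the dplyr version)
seBootFun <- function(resp, pred, B, data) {
  bootMax <- function() {
    # Resample rows with replacement, same size as `data`
    boot <- data[sample.int(nrow(data), replace = TRUE), ]
    y <- boot[[resp]]
    x <- boot[[pred]]
    b <- coef(lm(y ~ x + I(x^2)))
    # Vertex of the fitted parabola
    unname(-b[2] / (2 * b[3]))
  }
  maxs <- replicate(B, bootMax())
  c(estimated = mean(maxs), sd = sd(maxs))
}

responses <- c("Total_lignin", "Glucose", "Xylose", "arabinose")

# One worker per response at most; fall back sensibly if detectCores() is NA
nWorkers <- max(1, min(length(responses), detectCores() - 1, na.rm = TRUE))
cluster <- makeCluster(nWorkers)

# Reproducible, independent random number streams on the workers
clusterSetRNGStream(cluster, iseed = 8)

# Export only objects that exist in the global environment
clusterExport(cluster, c("seBootFun", "dat"))

# X is the vector of response names; fun is a function, not a function call
results <- parLapply(cluster, responses, function(r) {
  seBootFun(resp = r, pred = "concentration", B = 1000, data = dat)
})

stopCluster(cluster)

# One row per response, with columns `estimated` and `sd`
names(results) <- responses
resultTable <- do.call(rbind, results)
resultTable
```

    In this arrangement each response is one task, so at most four workers are busy at a time. With the dplyr version of seBootFun, the workers would also need dplyr available, for example through clusterEvalQ(cluster, library(dplyr)).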
     
