% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/utils_api-makeClusterFuture.R
\name{makeClusterFuture}
\alias{makeClusterFuture}
\alias{FUTURE}
\title{Create a Future Cluster of Stateless Workers for Parallel Processing}
\usage{
makeClusterFuture(specs = nbrOfWorkers(), ...)
}
\arguments{
\item{specs}{Ignored.
If specified, the value should equal \code{nbrOfWorkers()} (default).
A missing value corresponds to specifying \code{nbrOfWorkers()}.
This argument exists only to support
\code{parallel::makeCluster(NA, type = future::FUTURE)}.}

\item{\ldots}{Named arguments passed to \code{\link[=future]{future()}}.}
}
\value{
Returns a \pkg{parallel} \code{cluster} object of class \code{FutureCluster}.
}
\description{
\emph{WARNING: This sets up a stateless set of cluster nodes,
which means that \code{clusterEvalQ(cl, { a <- 3.14 })} will not work.
Consider this a first beta version and use it with great care,
particularly because of the stateless nature of the cluster.
For now, I recommend that you manually validate that this cluster type
gives results identical to those of the
classical \code{parallel::makeCluster()} cluster type.}
}
\section{Future Clusters are Stateless}{

Traditionally, a cluster node has a one-to-one mapping to a cluster
worker process. For example, \code{cl <- makeCluster(2, type = "PSOCK")}
launches two parallel worker processes in the background, where
cluster node \code{cl[[1]]} maps to worker #1 and node \code{cl[[2]]} to
worker #2, and that never changes through the lifespan of these
workers. This one-to-one mapping allows for deterministic
configuration of workers. For example, some code may assign globals
with values specific to each worker, e.g.
\code{clusterEvalQ(cl[1], { a <- 3.14 })} and
\code{clusterEvalQ(cl[2], { a <- 2.71 })}.

In contrast, there is no one-to-one mapping between cluster nodes
and the parallel workers when using a future cluster. This is because
we cannot make assumptions about where a parallel task will be
processed. Where a parallel task is processed is up to the future
backend to decide: some backends do this deterministically, whereas
others resolve the task on the first available worker. Also, the
worker processes might be \emph{transient} for some future backends, i.e.
they only exist for the lifespan of the parallel task and then
terminate.

Because of this, one must not rely on node-specific behaviors,
because that concept does not make sense with a future cluster.
To protect against this, any attempt to address a subset of future
cluster nodes results in an error, e.g. \code{clusterEvalQ(cl[1], ...)},
\code{clusterEvalQ(cl[1:2], ...)}, and \code{clusterEvalQ(cl[2:1], ...)} in
the above example will all give an error.

Exceptions to the latter limitation are \code{clusterSetRNGStream()}
and \code{clusterExport()}, which can be safely used with future clusters.
See below for more details.
}

\section{clusterSetRNGStream}{

\code{\link[parallel:clusterSetRNGStream]{parallel::clusterSetRNGStream()}}
distributes "L'Ecuyer-CMRG" RNG
streams to the cluster nodes, which record them such that the next
round of futures will use them. The RNG states after those
futures are resolved are then recorded, such that the following
round of futures will use those, and so on. This strategy
makes sure \code{clusterSetRNGStream()} has the expected effect although
futures are stateless.
}

\section{clusterExport}{

\code{\link[parallel:clusterApply]{parallel::clusterExport()}} assigns values to the cluster nodes.
Specifically, these values are recorded and are used as globals
for all futures created thereafter.
}

\section{clusterEvalQ}{

If \code{clusterEvalQ()} is called, the call is ignored, and an error
is produced. The error can be de-escalated to a warning by setting
R option \code{future.ClusterFuture.clusterEvalQ} to \code{"warning"}.
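
As a minimal sketch, the option named above can be set with base R
\code{options()} before \code{clusterEvalQ()} is called:

\preformatted{
options(future.ClusterFuture.clusterEvalQ = "warning")
}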
}

\examples{
\dontshow{if ((getRversion() >= "4.4.0")) withAutoprint(\{ # examplesIf}
plan(multisession)
cl <- makeClusterFuture()

parallel::clusterSetRNGStream(cl)

y <- parallel::parLapply(cl, 11:13, function(x) {
  message("Process ID: ", Sys.getpid())
  mean(rnorm(n = x))
})
str(y)
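
## A minimal sketch of clusterExport() with a future cluster. It assumes
## that the exported value is recorded and used as a global by futures
## created afterwards, as described in the clusterExport section.
## The names a and z are illustrative only.
a <- 3.14
parallel::clusterExport(cl, "a")
z <- parallel::parLapply(cl, 1:2, function(x) x * a)
str(z)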

plan(sequential)
\dontshow{\}) # examplesIf}
}
\keyword{internal}
