rxGetJobs: Get Distributed Computing Jobs
Returns a list of job objects associated with the given compute context and matching the specified parameters.
rxGetJobs(computeContext, exactMatch = FALSE, startTime = NULL, endTime = NULL, states = NULL, verbose = TRUE)
computeContext: A compute context object.
exactMatch: Determines whether jobs are matched using the full compute context or a simpler subset. If TRUE, only jobs that use the same context object are returned. If FALSE, all jobs that have the same headNode (if available) and shareDir are returned.
startTime: A time, specified as a POSIXct object. If specified, only jobs created at or after startTime are returned. For non-RxHadoopMR contexts, this time should be specified in the user's local time; for RxHadoopMR contexts, the time should be specified in GMT. See below for more details.
endTime: A time, specified as a POSIXct object. If specified, only jobs created at or before endTime are returned. For non-RxHadoopMR contexts, this time should be specified in the user's local time; for RxHadoopMR contexts, the time should be specified in GMT. See below for more details.
states: If specified (as a character vector of states that can include "running"), only jobs in those states are returned. Otherwise, no filtering is performed on job state.
verbose: If TRUE (the default), a brief summary of each job is printed as it is found. This includes the current job status as returned by rxGetJobStatus, the modification time of the job, and the current job ID (this is used as the component name in the returned list of job information objects). If no job status is returned, the job status shows
One common use of rxGetJobs is as input to the rxCleanupJobs function, which is used to clean up completed non-waiting jobs when autoCleanup is not specified.
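The pairing described above can be sketched as follows. This is a minimal illustration rather than a recommended workflow: cleanupFinishedJobs is a hypothetical helper name, the "finished" state is taken from the example on this page, and the calls need RevoScaleR and a reachable cluster, so they are wrapped in a function instead of being run directly.

```r
# Hypothetical helper (not part of RevoScaleR): fetch the finished jobs for a
# compute context and hand them to rxCleanupJobs.
cleanupFinishedJobs <- function(computeContext) {
  # Restrict the search to jobs that have already finished.
  finishedJobs <- RevoScaleR::rxGetJobs(computeContext,
                                        states = "finished",
                                        verbose = FALSE)
  # Pass the resulting job list to the cleanup function.
  RevoScaleR::rxCleanupJobs(finishedJobs)
}
```

Calling cleanupFinishedJobs(myCluster) would then remove the finished jobs associated with the myCluster context used in the example on this page.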
If exactMatch=FALSE, only the shared directory shareDir and the cluster headNode (if available) are compared. Otherwise, all slots are compared. However, if the nodes slot in either compute context is NULL, that slot is also omitted from the comparison.
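The matching rule above can be modelled in base R. This is an illustration of the described logic only, not the package's implementation: compute contexts are represented as plain lists rather than real compute context objects, contextsMatch is a hypothetical helper, consoleOutput stands in for any other slot, and treating a headNode that is missing from both contexts as a match is an assumption.

```r
# Hypothetical model of the comparison; contexts are plain lists here.
contextsMatch <- function(a, b, exactMatch = FALSE) {
  if (!exactMatch) {
    # Loose match: compare only shareDir and headNode.
    fields <- c("shareDir", "headNode")
  } else {
    # Exact match: compare every field found in either context ...
    fields <- union(names(a), names(b))
    # ... except nodes when it is NULL in either context.
    if (is.null(a[["nodes"]]) || is.null(b[["nodes"]])) {
      fields <- setdiff(fields, "nodes")
    }
  }
  all(vapply(fields, function(f) identical(a[[f]], b[[f]]), logical(1)))
}

ctxA <- list(shareDir = "\\AllShare\\myName", headNode = "cluster-head2",
             consoleOutput = TRUE)
ctxB <- list(shareDir = "\\AllShare\\myName", headNode = "cluster-head2",
             nodes = c("node1", "node2"), consoleOutput = FALSE)

looseMatch <- contextsMatch(ctxA, ctxB)                     # same shareDir and headNode
exactMatch <- contextsMatch(ctxA, ctxB, exactMatch = TRUE)  # consoleOutput differs
```

Here looseMatch is TRUE because the two lists share shareDir and headNode, while exactMatch is FALSE because consoleOutput differs; nodes is skipped in the exact comparison because it is NULL in ctxA.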
On LSF clusters, job information by default is held for only one hour (although this is configurable using the LSF parameter CLEAN_PERIOD); jobs older than the CLEAN_PERIOD setting will have status
For non-RxHadoopMR cluster types, all time values are specified and displayed in the local time of the user's computer, regardless of any time zone differences between the user's computer and the cluster. Start and end times for job filtering should therefore be provided in local time, with the expectation that cluster time values are also converted into the user's local time. For RxHadoopMR, job times are stored, and comparisons are performed, in GMT.
Note also that when there are a large number of jobs on the cluster, you can improve performance by using the startTime and endTime parameters to narrow your search.
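As a small base R sketch of building such a search window, the following constructs POSIXct bounds from wall-clock strings, once in the local time zone and once explicitly in GMT. The dates are arbitrary placeholders, and the rxGetJobs calls are left commented out because they need RevoScaleR and a cluster; myCluster refers to the context defined in the example on this page.

```r
# Local-time bounds: as.POSIXct() interprets the string in the current
# time zone when tz is not given.
startLocal <- as.POSIXct("2016-03-01 00:00:00")
endLocal   <- as.POSIXct("2016-03-08 00:00:00")

# GMT bounds: state the zone explicitly when writing the wall-clock time.
startGMT <- as.POSIXct("2016-03-01 00:00:00", tz = "GMT")
endGMT   <- as.POSIXct("2016-03-08 00:00:00", tz = "GMT")

# Length of the GMT window in days.
windowDays <- as.numeric(difftime(endGMT, startGMT, units = "days"))

# Non-RxHadoopMR context (local time):
# rxGetJobs(myCluster, startTime = startLocal, endTime = endLocal)
# RxHadoopMR context (GMT):
# rxGetJobs(myCluster, startTime = startGMT, endTime = endGMT)
```

A narrower window means fewer jobs have to be examined, which is where the performance gain mentioned above would come from.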
rxJobInfoList, list of job information objects based on the compute context.
Microsoft Technical Support
## Not run:
myCluster <- RxComputeContext("RxSpark",
    # Location of Revo64 on each node
    revoPath = file.path(defaultRNodePath, "bin", "x64"),
    nameNode = "cluster-head2",
    # User directory for read/write
    shareDir = "\\AllShare\\myName"
)
rxSetComputeContext(computeContext = myCluster)

# Get all jobs older than a week and newer than two weeks that are finished or canceled.
rxGetJobs(myCluster,
    startTime = Sys.time() - 3600 * 24 * 14,
    endTime = Sys.time() - 3600 * 24 * 7,
    exactMatch = FALSE,
    states = c("finished", "canceled")
)

# Get all jobs associated with myCluster compute context and then get job output and results
myJobs <- rxGetJobs(myCluster)
print(myJobs)
# returns
# rxJob_1461
myJobs$rxJob_1461
rxGetJobOutput(myJobs$rxJob_1461)
rxGetJobResults(myJobs$rxJob_1461)
## End(Not run)