sacct − displays accounting data for all jobs and job steps in the SLURM job accounting log
sacct options
Accounting information for jobs invoked with SLURM is logged in the job accounting log file.
The sacct command displays job accounting data stored in the job accounting log file in a variety of forms for your analysis. By default, the sacct command displays information on jobs, job steps, status, and exit codes. You can tailor the output with the −−fields= option, which specifies the fields to be shown.
For the root user, the sacct command displays job accounting data for all users, although there are options to filter the output to report only the jobs from a specified user or group.
For the non−root user, the sacct command limits the display of job accounting data to jobs that were launched with their own user identifier (UID) by default. Data for other users can be displayed with the −−all, −−user, or −−uid options.
Note:
Much of the data reported by sacct is generated by the wait3() and getrusage() system calls. Some systems gather and report incomplete information for these calls; sacct reports values of 0 for this missing data. See your system's getrusage(3) man page for information about which data are actually available on your system.
Options
−a , −−all
Displays the job accounting data for all jobs in the job accounting log file.
This is the default behavior when the sacct command is executed by the root user.
−b , −−brief
Displays a brief listing, which includes the following data:
• jobid
• status
• exitcode
This option has no effect when the −−dump option is also specified.
−d , −−dump
Displays (dumps) the raw data records.
This option overrides the −−brief and −−fields= options.
The section titled "INTERPRETING THE −−dump OPTION OUTPUT" describes the data output when this option is used.
−e time_spec , −−expire=time_spec
Removes job data from SLURM's current accounting log file (or the file specified with −−file) for jobs that completed more than time_spec ago and appends them to the expired log file.
If time_spec is an integer value only, it is interpreted as minutes. If time_spec is an integer followed by "h", it is interpreted as a number of hours. If time_spec is an integer followed by "d", it is interpreted as a number of days. For example, "−−expire=14d" purges the job accounting log of all jobs that completed more than 14 days ago.
The expired log file is a file with the same name as the accounting log file, with ".expired" appended to the file name. For example, if the accounting log file is /var/log/slurmacct.log, the expired log file will be /var/log/slurmacct.log.expired.
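The time_spec rules for −−expire can be illustrated with a short Python sketch. This is not part of sacct; the function name expire_spec_to_minutes is hypothetical, and the sketch assumes that only a bare integer, or an integer followed by a lowercase "h" or "d", is valid.

```python
import re


def expire_spec_to_minutes(time_spec: str) -> int:
    """Convert an --expire time_spec to minutes.

    Illustrative only: a bare integer is minutes, an "h" suffix means
    hours, and a "d" suffix means days, as described for --expire.
    """
    match = re.fullmatch(r"([0-9]+)([hd]?)", time_spec)
    if match is None:
        raise ValueError(f"unrecognized time_spec: {time_spec!r}")
    value = int(match.group(1))
    # Minutes per unit; the empty suffix means the value is already minutes.
    minutes_per_unit = {"": 1, "h": 60, "d": 24 * 60}
    return value * minutes_per_unit[match.group(2)]
```

Under these assumptions, a time_spec of "14d" corresponds to 14 x 24 x 60 minutes.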
−F field_list , −−fields=field_list
Displays the job accounting data specified by the field_list operand, which is a comma−separated list of fields. Space characters are not allowed in the field_list.
See the −−help−fields option for a list of the available fields. See the section titled "Job Accounting Fields" for a description of each field.
The job accounting data is displayed in the order specified by the field_list operand. Thus, the following two commands display the same data but in different order:
# sacct −−fields=jobid,status
Jobid Status
−−−−−−−−−− −−−−−−−−−−
3 COMPLETED
3.0 COMPLETED
# sacct −−fields=status,jobid
Status Jobid
−−−−−−−−−− −−−−−−−−−−
COMPLETED 3
COMPLETED 3.0
The default value for the field_list operand is "jobid,jobname,partition,ncpus,status,exitcode".
This option has no effect when the −−dump option is also specified.
−f file, −−file=file
Causes the sacct command to read job accounting data from the named file instead of the current SLURM job accounting log file.
−g gid, −−gid=gid
Displays the statistics only for the jobs started with GID gid.
−g group, −−group=group
Displays the statistics only for the jobs started by users in the group group.
−h , −−help
Displays a general help message.
−−help−fields
Displays a list of fields that can be specified with the −−fields option.
Fields available:
account    blockid    cpu        cputime
elapsed    end        exitcode   gid
group      idrss      inblocks   isrss
ixrss      job        jobid      jobname
majflt     minflt     msgrcv     msgsnd
ncpus      nivcsw     nodes      nprocs
nsignals   nswap      ntasks     nvcsw
outblocks  pages      partition  rss
start      status     submit     systemcpu
uid        user       usercpu    vsize
The section titled "Job Accounting Fields" describes these fields.
−j job(.step) , −−jobs=job(.step)
Displays information about the specified job(.step) or list of job(.step)s.
The job(.step) parameter is a comma−separated list of jobs. Space characters are not permitted in this list.
The default is to display information on all jobs.
−l, −−long
Displays a long listing, which includes the following data:
• jobid
• jobname
• partition
• vsize
• rss
• pages
• cputime
• ntasks
• ncpus
• elapsed
• status
• exitcode
−−noheader
Prevents the display of the heading over the output. The default action is to display a header.
This option has no effect when used with the −−dump option.
−O , −−formatted_dump
Dumps accounting records in an easy−to−read format.
This option is provided for debugging.
−p partition_list , −−partition=partition_list
Displays information about jobs and job steps specified by the partition_list operand, which is a comma−separated list of partitions. Space characters are not allowed in the partition_list.
The default is to display information on jobs and job steps on all partitions.
−S , −−stat
Queries the status of a running job and displays the following data:
• jobid
• vsize
• rss
• pages
• cputime
• ntasks
• status
You must also include the −−jobs=job(.step) option. If no (.step) is given, you will receive the job.0 step.
−s state_list , −−state=state_list
Selects jobs based on their current state, which can be designated with the following state designators:
r   running
s   suspended
ca  cancelled
cd  completed
pd  pending
f   failed
to  timed out
nf  node_fail
The state_list operand is a comma−separated list of these state designators. Space characters are not allowed in the state_list.
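The mapping from state designators to states can be sketched in Python as follows. The names STATE_DESIGNATORS and expand_state_list are hypothetical, and the sketch assumes designators are given in lowercase exactly as listed above; whether sacct itself accepts other spellings is not stated here.

```python
# Designators and state names copied from the list for --state.
STATE_DESIGNATORS = {
    "r": "running",
    "s": "suspended",
    "ca": "cancelled",
    "cd": "completed",
    "pd": "pending",
    "f": "failed",
    "to": "timed out",
    "nf": "node_fail",
}


def expand_state_list(state_list: str) -> list[str]:
    """Translate a comma-separated state_list into state names.

    Illustrative only: any entry not in the list (including one that
    contains a space character) raises ValueError.
    """
    names = []
    for designator in state_list.split(","):
        if designator not in STATE_DESIGNATORS:
            raise ValueError(f"unknown state designator: {designator!r}")
        names.append(STATE_DESIGNATORS[designator])
    return names
```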
−t , −−total
Displays only the cumulative statistics for each job. Intermediate steps are displayed by default.
−u uid, −−uid=uid
Displays the statistics only for the jobs started by the user whose UID is uid.
−u user, −−user=user
Displays the statistics only for the jobs started by user user.
−−usage
Displays a help message.
−v , −−verbose
Reports the state of certain variables during processing. This option is primarily used for debugging.
Job Accounting Fields
The following describes each job accounting field:
account
User supplied account number for the job
blockid
Block ID, applicable to BlueGene computers only
cpu
The sum of the system time (systemcpu) and user time (usercpu) in seconds
cputime
Minimum CPU time of any process followed by its task id along with the average of all processes running in the step.
elapsed
The job's elapsed time.
The format of this field's output is as follows:
[DD−[hh:]]mm:ss
as defined by the following:
DD  days
hh  hours
mm  minutes
ss  seconds
end
Termination time of the job. The output format is as follows:
MM/DD−hh:mm:ss
as defined by the following:
MM  month
DD  day
hh  hours
mm  minutes
ss  seconds
exitcode
The first non−zero error code returned by any job step.
gid
The group identifier of the user who ran the job.
group
The group name of the user who ran the job.
idrss
Maximum unshared data size (in KB) of any process.
inblocks
Total block input operations for all processes.
isrss
Maximum unshared stack space size (in KB) of any process.
ixrss
Maximum shared memory (in KB) of any process.
job
The SLURM job identifier of the job.
jobid
The number of the job or job step. It is in the form: job.jobstep.
jobname
The name of the job or job step.
majflt
Maximum number of major page faults for any process.
minflt
Maximum number of minor page faults (page reclaims) for any process.
msgrcv
Total number of messages received for all processes.
msgsnd
Total number of messages sent for all processes.
ncpus
Total number of CPUs allocated to the job.
nivcsw
Total number of involuntary context switches for all processes.
nodes
A list of nodes allocated to the job.
nprocs
Total number of tasks in job. Identical to ntasks.
nsignals
Total number of signals received for all processes.
nswap
Maximum number of swap operations of any process.
ntasks
Total number of tasks in job.
nvcsw
Total number of voluntary context switches for all processes.
outblocks
Total block output operations for all processes.
pages
Maximum page faults of any process followed by its task id along with the average of all processes running in the step.
partition
Identifies the partition on which the job ran.
rss
Maximum resident set size of any process followed by its task id along with the average of all processes running in the step.
start
Initiation time of the job in the same format as end.
status
Displays the job status, or state.
Output can be RUNNING, SUSPENDED, COMPLETED, CANCELLED, FAILED, TIMEOUT, or NODE_FAIL.
submit
The time and date stamp (in Coordinated Universal Time, UTC) at which the job was submitted. The format of the output is identical to that of the end field.
systemcpu
The amount of system CPU time. (If the job ran on multiple CPUs, this is the combined time for all of them, so this number can be much larger than the elapsed time.) The format of the output is identical to that of the elapsed field.
uid
The user identifier of the user who ran the job.
uid.gid
The user and group identifiers of the user who ran the job. (This field is used in record headers, and simply concatenates the uid and gid fields.)
user
The user name of the user who ran the job.
usercpu
The amount of user CPU time. (If the job ran on multiple CPUs, this is the combined time for all of them, so this number can be much larger than the elapsed time.) The format of the output is identical to that of the elapsed field.
vsize
Maximum Virtual Memory size of any process followed by its task id along with the average of all processes running in the step.
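The format of the elapsed field, which the systemcpu and usercpu fields share, can be converted to a number of seconds with a short Python sketch such as the one below. The function name elapsed_to_seconds is hypothetical. The sketch assumes the separator after the days value is an ASCII hyphen, and it is deliberately lenient: it also accepts an hours value without a days value.

```python
import re

# Optional "DD-" prefix, optional "hh:" prefix, then mandatory "mm:ss".
_ELAPSED = re.compile(r"(?:([0-9]+)-)?(?:([0-9]+):)?([0-9]+):([0-9]+)")


def elapsed_to_seconds(elapsed: str) -> int:
    """Convert a [DD-[hh:]]mm:ss value to a whole number of seconds.

    Illustrative only: missing days or hours are treated as zero.
    """
    match = _ELAPSED.fullmatch(elapsed)
    if match is None:
        raise ValueError(f"unrecognized elapsed value: {elapsed!r}")
    days, hours, minutes, seconds = (
        int(group) if group else 0 for group in match.groups()
    )
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds
```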
Interpreting the −−dump Option Output
The sacct command's −−dump option displays data in a horizontal list of fields depending on the record type; there are three record types: JOB_START, JOB_STEP, and JOB_TERMINATED. There is a subsection that describes the output for each record type.
When the data output is a job accounting field, as described in the section titled "Job Accounting Fields", only the name of the job accounting field is listed. Otherwise, additional information is provided.
Note:
The output for the JOB_STEP and JOB_TERMINATED record types presents a pair of fields for each of the following data: Total CPU time, Total User CPU time, and Total System CPU time. The first field of each pair is the time in seconds expressed as an integer. The second field of each pair is the fractional number of seconds multiplied by one million. Thus, a pair of fields output as "1 024315" means that the time is 1.024315 seconds. The least significant digits in the second field are truncated in formatted displays.
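The arithmetic behind each pair of time fields can be sketched in Python as follows. The function name combine_time_pair is hypothetical; the sketch assumes both fields have already been extracted from a dump record as strings of digits.

```python
from decimal import Decimal


def combine_time_pair(seconds_field: str, microseconds_field: str) -> Decimal:
    """Combine an integer-seconds field and a microseconds field.

    Illustrative only: the second field is the fractional part of the
    time multiplied by one million, so it is divided by one million
    before being added to the whole seconds.
    """
    whole = Decimal(int(seconds_field))
    fraction = Decimal(int(microseconds_field)) / Decimal(1_000_000)
    return whole + fraction
```

Decimal is used here instead of float so that the microsecond digits are preserved exactly.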
Output for the JOB_START Record Type
The following describes the horizontal fields output by the sacct −−dump option for the JOB_START record type.
Field #  Field
1        job
2        partition
3        submitted
4        The job's start time; this value is the number of non−leap seconds since the Epoch (00:00:00 UTC, January 1, 1970)
5        uid.gid
6        (Reserved)
7        JOB_START (literal string)
8        Job Record Version (1)
9        The number of fields in the record (16)
10       uid
11       gid
12       The job name
13       Batch Flag (0=no batch)
14       Relative SLURM priority
15       ncpus
16       nodes
Output for the JOB_STEP Record Type
The following describes the horizontal fields output by the sacct −−dump option for the JOB_STEP record type.
Field #  Field
1        job
2        partition
3        submitted
4        The job's start time; this value is the number of non−leap seconds since the Epoch (00:00:00 UTC, January 1, 1970)
5        uid.gid
6        (Reserved)
7        JOB_STEP (literal string)
8        Job Record Version (1)
9        The number of fields in the record (38)
10       jobid
11       end
12       Completion Status; the mnemonics, which may appear in uppercase or lowercase, are as follows:
         CA  Cancelled
         CD  Completed successfully
         F   Failed
         NF  Job terminated from node failure
         R   Running
         S   Suspended
         TO  Timed out
13       exitcode
14       ntasks
15       ncpus
16       elapsed time in seconds expressed as an integer
17       Integer portion of the Total CPU time in seconds for all processes
18       Fractional portion of the Total CPU time for all processes expressed in microseconds
19       Integer portion of the Total User CPU time in seconds for all processes
20       Fractional portion of the Total User CPU time for all processes expressed in microseconds
21       Integer portion of the Total System CPU time in seconds for all processes
22       Fractional portion of the Total System CPU time for all processes expressed in microseconds
23       rss
24       ixrss
25       idrss
26       isrss
27       minflt
28       majflt
29       nswap
30       inblocks
31       outblocks
32       msgsnd
33       msgrcv
34       nsignals
35       nvcsw
36       nivcsw
37       vsize
Output for the JOB_TERMINATED Record Type
The following describes the horizontal fields output by the sacct −−dump option for the JOB_TERMINATED record type.
Field #  Field
1        job
2        partition
3        submitted
4        The job's start time; this value is the number of non−leap seconds since the Epoch (00:00:00 UTC, January 1, 1970)
5        uid.gid
6        (Reserved)
7        JOB_TERMINATED (literal string)
8        Job Record Version (1)
9        The number of fields in the record (38)
         Although thirty−eight fields are displayed by the sacct command for the JOB_TERMINATED record, only fields 1 through 12 are recorded in the actual data file; the sacct command aggregates the remainder.
10       The total elapsed time in seconds for the job.
11       end
12       Completion Status; the mnemonics, which may appear in uppercase or lowercase, are as follows:
         CA  Cancelled
         CD  Completed successfully
         F   Failed
         NF  Job terminated from node failure
         R   Running
         TO  Timed out
13       exitcode
14       ntasks
15       ncpus
16       elapsed time in seconds expressed as an integer
17       Integer portion of the Total CPU time in seconds for all processes
18       Fractional portion of the Total CPU time for all processes expressed in microseconds
19       Integer portion of the Total User CPU time in seconds for all processes
20       Fractional portion of the Total User CPU time for all processes expressed in microseconds
21       Integer portion of the Total System CPU time in seconds for all processes
22       Fractional portion of the Total System CPU time for all processes expressed in microseconds
23       rss
24       ixrss
25       idrss
26       isrss
27       minflt
28       majflt
29       nswap
30       inblocks
31       outblocks
32       msgsnd
33       msgrcv
34       nsignals
35       nvcsw
36       nivcsw
37       vsize
This example illustrates the default invocation of the sacct command:
# sacct
Jobid Jobname Partition Ncpus Status Exitcode
−−−−−−−−−− −−−−−−−−−− −−−−−−−−−− −−−−−−− −−−−−−−−−− −−−−−−−−
2 script01 srun 1 RUNNING 0
3 script02 srun 1 RUNNING 0
4 endscript srun 1 RUNNING 0
4.0 srun 1 COMPLETED 0
This example shows the same job accounting information with the brief option.
# sacct −−brief
Jobid Status Exitcode
−−−−−−−−−− −−−−−−−−−− −−−−−−−−
2 RUNNING 0
3 RUNNING 0
4 RUNNING 0
4.0 COMPLETED 0
# sacct −−total
Jobid Jobname Partition Ncpus Status Exitcode
−−−−−−−−−− −−−−−−−−−− −−−−−−−−−− −−−−−−− −−−−−−−−−− −−−−−−−−
3 sja_init andy 1 COMPLETED 0
4 sjaload andy 2 COMPLETED 0
5 sja_scr1 andy 1 COMPLETED 0
6 sja_scr2 andy 18 COMPLETED 2
7 sja_scr3 andy 18 COMPLETED 0
8 sja_scr5 andy 2 COMPLETED 0
9 sja_scr7 andy 90 COMPLETED 1
10 endscript andy 186 COMPLETED 0
This example demonstrates the ability to customize the output of the sacct command. The fields are displayed in the order designated on the command line.
# sacct −−fields=jobid,ncpus,ntasks,nsignals,status
Jobid Ncpus Ntasks Nsignals Status
−−−−−−−−−− −−−−−−− −−−−−−− −−−−−−−−− −−−−−−−−−−
3 2 1 0 COMPLETED
3.0 2 1 0 COMPLETED
4 2 2 0 COMPLETED
4.0 2 2 0 COMPLETED
5 2 1 0 COMPLETED
5.0 2 1 0 COMPLETED
Copyright (C) 2005−2007 Hewlett−Packard Development Company L.P.
This file is part of SLURM, a resource management program. For details, see <https://computing.llnl.gov/linux/slurm/>.
SLURM is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.
SLURM is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
/etc/slurm.conf
Entries to this file enable job accounting and designate the job accounting log file that collects system job accounting.
/var/log/slurm_accounting.log
The default job accounting log file. By default, this file is set to read and write permission for root only.