.. _tensorboard:

You can run TensorBoard as a job; this is the preferred method.

First, start an interactive session with a reserved port::

    [studentx@login-1 ~]$ LLsub -i --resv-ports 1
    salloc --immediate=60 -p normal --constraint=xeon-e5 --cpus-per-task=4 --qos=high srun --resv-ports=1 --pty bash -i
    salloc: Granted job allocation 355286
    salloc: Waiting for resource configuration
    salloc: Nodes node-052 are ready for job

Then create your logging directory in ``TMPDIR``::

    [studentx@node-052 ~]$ mkdir -p ${TMPDIR}/tensorboard

Set up your forwarding name and file::

    [studentx@node-052 ~]$ PORTAL_FWNAME="$(id -un | tr '[A-Z]' '[a-z]')-tensorboard"
    [studentx@node-052 ~]$ PORTAL_FWFILE="/home/gridsan/portal-url-fw/${PORTAL_FWNAME}"
    [studentx@node-052 ~]$ echo $PORTAL_FWFILE
    /home/gridsan/portal-url-fw/studentx-tensorboard
    [studentx@node-052 ~]$ echo "Portal URL is: https://${PORTAL_FWNAME}.fn.txe1-portal.mit.edu/"
    Portal URL is: https://studentx-tensorboard.fn.txe1-portal.mit.edu/

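The naming rule above can be sketched as a small shell function. Here
``portal_url`` is a hypothetical helper, not part of any cluster tooling; it
assumes the URL pattern shown in the example and takes the username as an
argument so it can be tried for any account name.

```shell
# Hypothetical helper: build the portal URL for a given username,
# following the pattern shown in the example above.
portal_url() {
    # Lowercase the username and append the "-tensorboard" suffix.
    fwname="$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')-tensorboard"
    printf 'https://%s.fn.txe1-portal.mit.edu/\n' "$fwname"
}

# Prints https://studentx-tensorboard.fn.txe1-portal.mit.edu/
portal_url "StudentX"
```

Calling ``portal_url "$(id -un)"`` should give the same URL as the ``echo``
command in the session above.
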
Put the forward URL in the forwarding file. The command appends with
``>>``, so earlier sessions may have left older entries behind: when you
run ``cat $PORTAL_FWFILE`` you should see only one line, and if you see
two or more, delete all but the last line::

    [studentx@node-052 ~]$ echo "http://$(hostname -s):${SLURM_STEP_RESV_PORTS}/" >> $PORTAL_FWFILE
    [studentx@node-052 ~]$ cat $PORTAL_FWFILE
    http://node-052:12637/

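Trimming the file down to its last line can also be scripted. The following
is a minimal sketch: ``keep_last_line`` is a hypothetical helper, and the
demonstration runs on a scratch file with made-up node names rather than on
the real forwarding file.

```shell
# Hypothetical helper: keep only the last line of a file.
# The file is rewritten in place (not replaced), so its permissions survive.
keep_last_line() {
    fwfile="$1"
    tmp="$(mktemp)"
    tail -n 1 "$fwfile" > "$tmp" && cat "$tmp" > "$fwfile"
    rm -f "$tmp"
}

# Demonstration on a scratch file holding one stale and one current entry.
demo_fwfile="$(mktemp)"
printf 'http://node-051:11111/\nhttp://node-052:12637/\n' > "$demo_fwfile"
keep_last_line "$demo_fwfile"
cat "$demo_fwfile"   # prints http://node-052:12637/
```

On the cluster, the equivalent call would be ``keep_last_line "$PORTAL_FWFILE"``.
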
Set the permissions on the forwarding file::

    [studentx@node-052 ~]$ chmod u+x ${PORTAL_FWFILE}

Load an Anaconda module and start TensorBoard::

    [studentx@node-052 ~]$ module load anaconda/2020a
    [studentx@node-052 ~]$ tensorboard --logdir ${TMPDIR}/tensorboard --host "$(hostname -s)" --port ${SLURM_STEP_RESV_PORTS}

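Before launching, it can help to confirm that the inputs to the command are
in place. The sketch below uses a hypothetical ``preflight`` function; it
assumes that, with one reserved port, ``SLURM_STEP_RESV_PORTS`` holds a single
number (as in the ``12637`` example above), and it is demonstrated with a
scratch directory.

```shell
# Hypothetical pre-flight check: succeed only if the port value is a
# single number and the log directory exists.
preflight() {
    port="$1"
    logdir="$2"
    case "$port" in
        ''|*[!0-9]*)
            echo "no single reserved port: '$port'" >&2
            return 1 ;;
    esac
    if [ ! -d "$logdir" ]; then
        echo "missing log directory: $logdir" >&2
        return 1
    fi
    echo "ok: port $port, logdir $logdir"
}

# Demonstration with a made-up port and a scratch directory.
demo_dir="$(mktemp -d)"
preflight 12637 "$demo_dir"
```

In the session above, this could be called as
``preflight "${SLURM_STEP_RESV_PORTS}" "${TMPDIR}/tensorboard"`` just before
the ``tensorboard`` command.
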
In your browser, go to the portal URL printed above (in this example,
https://studentx-tensorboard.fn.txe1-portal.mit.edu/).