This is the official implementation of the "SPAtio-Temporal graph System (SPATS)" presented in the following paper:
- Yoon, Heeyong, Kang-Wook Chon, and Min-Soo Kim. "SPATS: a practical system for comparative analysis of spatio-temporal graph neural networks." Cluster Computing 28.13 (2025): 826, https://doi.org/10.1007/s10586-025-05523-6.
If you use this repository in your research project, please cite the following BibTeX entry in your paper:
```
@article{yoon2025spats,
    author={Yoon, Heeyong and Chon, Kang-Wook and Kim, Min-Soo},
    title={SPATS: a practical system for comparative analysis of spatio-temporal graph neural networks},
    journal={Cluster Computing},
    year={2025},
    month={Sep},
    day={19},
    volume={28},
    number={13},
    pages={826},
    issn={1573-7543},
    doi={10.1007/s10586-025-05523-6},
}
```

⚠ This is not a commercial system, so we do not guarantee its behavior in all environments. Please handle exceptional cases with appropriate external knowledge, official documentation, and experience.
⚠ Some parts contain template text; do not copy and paste them directly, and make sure to replace the template text appropriately.
⚠ Because NFS recognizes users only by numeric ID, it is strongly recommended to set all nodes' user and group IDs to the same values. Otherwise, unexpected behavior can occur (see the Linux commands `id`, `usermod`, and `groupmod`).
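For example, the following is a minimal sketch of ours (not part of SPATS) that prints the UID/GID of a given user; running it on every node lets you confirm that the IDs match. The username below is only a placeholder.
```python
# Minimal sketch (not part of SPATS): print the UID/GID of a user so the values
# can be compared across nodes. Replace 'your_user' with the account running SPATS.
import grp
import pwd

USERNAME = 'your_user'  # placeholder account name

entry = pwd.getpwnam(USERNAME)      # raises KeyError if the user does not exist
group = grp.getgrgid(entry.pw_gid)  # primary group of that user
print(f'{USERNAME}: uid={entry.pw_uid}, gid={entry.pw_gid} ({group.gr_name})')
```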
- Prepare your GPU cluster and set one node as the master server and the other nodes as worker servers. The master and worker roles can overlap, but separating them is recommended because the master's CPU usage might affect model training.

  Our testing environment for each node:
  - Ubuntu 18.04
  - CUDA 11.3 to 12.1 (different per node)
  - Python 3.10 (installed by Miniconda) and necessary packages (PyTorch, NumPy, ...)
- Install the `NFS` server on the master server.

  a. Install the NFS server package.
  ```
  sudo apt install nfs-kernel-server -y # install NFS server
  ```
  b. Modify `/etc/exports`.
  ```
  sudo vim /etc/exports
  ```
  c. Write the following line at the bottom of `/etc/exports`.
  ```
  /your_master_server_folder/SPATS/ *(rw,sync,no_subtree_check)
  ```
  d. Restart `NFS`.
  ```
  sudo exportfs -a
  sudo systemctl restart nfs-kernel-server
  ```
  e. Download this repository on the master server.
  ```
  cd /your_master_server_folder/
  git clone https://github.com/sunrise2575/SPATS
  ```
  f. Make the repository visible to anyone.
  ```
  sudo chmod 777 ./SPATS/ # Make the repository visible to anyone
  ```
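  If you want to double-check the export from a script, the following is a minimal sketch of ours (not a SPATS tool) that verifies the line from step c is present in `/etc/exports`; the path is the placeholder used above.
  ```python
  # Minimal sketch (not part of SPATS): check that /etc/exports contains the SPATS export.
  EXPORT_PATH = '/your_master_server_folder/SPATS/'  # placeholder path from step c

  with open('/etc/exports') as f:
      found = any(line.strip().startswith(EXPORT_PATH) for line in f)
  print('export line present:', found)
  ```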
- Install the `NFS` client and mount the remote folder on each worker server.

  a. Install the `NFS` client.
  ```
  sudo apt install nfs-common -y
  ```
  b. Mount the master server's folder.
  ```
  mkdir -p /your_worker_server_folder/SPATS/
  sudo mount <master_server_IP>:/your_master_server_folder/SPATS /your_worker_server_folder/SPATS/
  ```
  To check the master server's IP, use the `ip a` or `ifconfig -a` command.

  c. Make sure the remote folder is mounted well.
  ```
  ls -al /your_worker_server_folder/SPATS/
  ```
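  If you prefer to confirm the mount programmatically (for example from a provisioning script), here is a minimal sketch of ours, not part of SPATS; the path is the placeholder used above.
  ```python
  # Minimal sketch (not part of SPATS): confirm that the SPATS folder on this worker
  # is an active mount point and list its top-level entries.
  import os

  MOUNT_POINT = '/your_worker_server_folder/SPATS/'  # placeholder path from step b

  if os.path.ismount(MOUNT_POINT.rstrip('/')):
      print('mounted:', sorted(os.listdir(MOUNT_POINT)))
  else:
      print(f'{MOUNT_POINT} is not a mount point; check the NFS setup')
  ```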
- Install `Python` dependencies on every server (both the master server and all worker servers).
  ```
  conda activate base # or specify a different virtualenv
  pip install -r requirements.txt
  ```
  Some Python packages in `requirements.txt` are not version-sensitive, so you can selectively modify `requirements.txt` to avoid upgrading or downgrading your existing packages.
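  After installing the dependencies, you may also want to confirm that PyTorch sees the GPUs on each node. This is an optional check of ours, not a SPATS script.
  ```python
  # Minimal sketch (not part of SPATS): verify that PyTorch detects CUDA and
  # enumerate the GPUs visible on this node.
  import torch

  print('CUDA available:', torch.cuda.is_available())
  for i in range(torch.cuda.device_count()):
      print(f'GPU {i}: {torch.cuda.get_device_name(i)}')
  ```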
- Launch the Broker and Workers.

  a. Launch the Broker process on the master server.
  ```
  python ./broker.py --port <broker_port>
  ```
  The default port number is `9999`; if you do not need to specify the port number, you can simply run:
  ```
  python ./broker.py
  ```
  b. Launch a Worker process on each worker server.
  ```
  python ./worker.py --broker <broker_IP>:<broker_port> --gpu <GPU_indices_as_you_wish>
  # python ./worker.py --broker 1.2.3.4:9999 --gpu 0,2,5 # example; using only GPUs 0, 2, and 5
  ```
  We recommend using `tmux` to manage the processes on each node efficiently.
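  Before launching Workers, it can save time to confirm that the Broker port is reachable from each worker node. The sketch below is our own convenience check (a plain TCP connect), not part of SPATS, and the address is a placeholder.
  ```python
  # Minimal sketch (not part of SPATS): test whether the Broker's TCP port is
  # reachable from this worker node before starting worker.py.
  import socket

  BROKER_ADDR = ('1.2.3.4', 9999)  # placeholder; use your Broker's IP and port

  try:
      with socket.create_connection(BROKER_ADDR, timeout=3):
          print('Broker port is reachable')
  except OSError as e:
      print('Cannot reach the Broker:', e)
  ```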
- Insert jobs into the Broker.

  ⚠ The following content may be lengthy and complex. Since SPATS was developed for research purposes and aims to execute queries in Python without using a structured language like SQL, the details can be intricate. We recommend reading through carefully and learning by using `query-bulk-insert.py` yourself.

  a. Find the following lines in `query.py` and `query-bulk-insert.py`
  ```
  BROKER_IP='127.0.0.1'
  BROKER_PORT='9999'
  ```
  and replace them with the IP and port number of your Broker.
  ```
  BROKER_IP=<broker_IP>
  BROKER_PORT=<broker_port>
  ```
  b. Insert multiple jobs using `query-bulk-insert.py`.

  In `query-bulk-insert.py`, the variable `DEFAULT` is the default job setting. By changing some values of `DEFAULT` and `VARIATION`, `query-bulk-insert.py` generates thousands of job combinations. The following example shows how to use `VARIATION` properly.
  ```python
  VARIATION = {
      'maxEpoch': [10],
      'batchSize': [64, 128], # 2 variations
      'datasetName': ['METR-LA', 'PEMS-BAY'], # 2 variations
      'modelName': ['Seq2Seq_RNN', 'ASTGCN'], # 2 variations
      'adjacencyMatrixThresholdValue': [float('0.' + str(i)) for i in range(2)], # 2 variations
      # -> 2x2x2x2=16 variations
  }
  ```
  In this case, SPATS sets `maxEpoch` to `10`, while `batchSize` takes both `64` and `128`. The candidate `datasetName` and `modelName` values work similarly. You can also use Python comprehension syntax in `VARIATION`, as in the `adjacencyMatrixThresholdValue` entry of the example.

  On the other hand, `DATASET_DEPENDENT_SETTING` and `MODEL_DEPENDENT_SETTING` in `query-bulk-insert.py` are variables that conveniently change other values in `DEFAULT`; they are selected according to `datasetName` and `modelName` in `VARIATION`, respectively (a short expansion sketch follows the `MODEL_DEPENDENT_SETTING` block below). For example, because the above `VARIATION` example specifies `METR-LA` and `PEMS-BAY` as the candidate datasets, the following items in `DATASET_DEPENDENT_SETTING` are selected:
  ```python
  DATASET_DEPENDENT_SETTING = {
      ...
      'METR-LA': {
          'additionalTemporalEmbed': ['timestamp_in_day'], # 1 variation
          'targetSensorAttributes': [['speed']], # 1 variation
          'inputLength': [12], 'outputLength': [12], # 1 variation each
          # -> 1x1x1x1=1 variation
      },
      'PEMS-BAY': {
          'additionalTemporalEmbed': ['timestamp_in_day'], # 1 variation
          'targetSensorAttributes': [['speed']], # 1 variation
          'inputLength': [12], 'outputLength': [12], # 1 variation each
          # -> 1x1x1x1=1 variation
      },
      ...
  }
  ```
  Similarly, the following items in `MODEL_DEPENDENT_SETTING` are selected because `modelName` is `['Seq2Seq_RNN', 'ASTGCN']`:
  ```python
  MODEL_DEPENDENT_SETTING = {
      ...
      'Seq2Seq_RNN': {
          'adjacencyMatrixLaplacianMatrix': [None],
      }, # 1 variation
      'ASTGCN': {
          'adjacencyMatrixLaplacianMatrix': ['cheb_poly'],
      }, # 1 variation
      ...
  }
  ```
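  To make the counting concrete, the following is a minimal sketch of how a Cartesian product over `VARIATION` expands into the 16 job combinations. It only illustrates the expansion logic; it is not the actual `stdin_generator()` implementation in `query-bulk-insert.py`.
  ```python
  # Minimal sketch (not the actual stdin_generator()): expand VARIATION into all
  # value combinations and count them.
  import itertools

  VARIATION = {
      'maxEpoch': [10],
      'batchSize': [64, 128],
      'datasetName': ['METR-LA', 'PEMS-BAY'],
      'modelName': ['Seq2Seq_RNN', 'ASTGCN'],
      'adjacencyMatrixThresholdValue': [0.0, 0.1],
  }

  keys = list(VARIATION)
  jobs = [dict(zip(keys, values)) for values in itertools.product(*VARIATION.values())]
  print(len(jobs))  # 1 * 2 * 2 * 2 * 2 = 16
  print(jobs[0])    # {'maxEpoch': 10, 'batchSize': 64, 'datasetName': 'METR-LA', ...}
  ```
  In SPATS, each of these combinations is further merged with `DEFAULT`, `DATASET_DEPENDENT_SETTING`, and `MODEL_DEPENDENT_SETTING` to form the final `stdin` dictionaries shown below.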
  If the default values are like this:
  ```python
  DEFAULT = {
      'trainTestRatio': 0.7,
      'adjacencyMatrixThresholdValue': 0.7,
      'maxEpoch': 100,
      'batchSize': 64,
      'lossFunction': ["MAE", "MSE", "MAAPE"], # this should be a list; 1 variation
      'targetLossFunction': "MSE",
      # about the optimizer
      'optimizer': 'Adam',
      'learningRate': 0.001,
      'weightDecay': 0.0005,
  }
  ```
  then the `stdin_generator()` algorithm, together with the additional model configuration stored in `.yaml` files, generates 16 job variations such as
  ```python
  stdin = {
      'datasetName': 'METR-LA', # set by VARIATION
      'adjacencyMatrixThresholdValue': 0.0, # changed by VARIATION
      'modelName': 'Seq2Seq_RNN', # set by VARIATION
      'maxEpoch': 10, # changed by VARIATION
      'batchSize': 64, # changed by VARIATION
      'adjacencyMatrixLaplacianMatrix': None, # set by MODEL_DEPENDENT_SETTING
      'modelConfig': {}, # automatically filled from common/model/<MODEL_NAME>.yaml; if it doesn't exist, it becomes an empty dict
      'additionalTemporalEmbeds': ['timestamp_in_day'], # set by DATASET_DEPENDENT_SETTING
      'inputLength': 12, # set by DATASET_DEPENDENT_SETTING
      'outputLength': 12, # set by DATASET_DEPENDENT_SETTING
      'targetSensorAttributes': ['speed'], # set by DATASET_DEPENDENT_SETTING
      'trainTestRatio': 0.7, # from DEFAULT
      'learningRate': 0.001, # from DEFAULT
      'weightDecay': 0.0005, # from DEFAULT
  }
  ```
  or
  ```python
  stdin = {
      'datasetName': 'PEMS-BAY', # set by VARIATION
      'adjacencyMatrixThresholdValue': 0.1, # changed by VARIATION
      'modelName': 'ASTGCN', # set by VARIATION
      'maxEpoch': 10, # changed by VARIATION
      'batchSize': 128, # changed by VARIATION
      'adjacencyMatrixLaplacianMatrix': 'cheb_poly', # set by MODEL_DEPENDENT_SETTING
      'modelConfig': {
          'time_strides': 1,
          'nb_block': 2,
          'nb_chev_filter': 64,
          'nb_time_filter': 64,
      }, # automatically filled from common/model/<MODEL_NAME>.yaml; if it doesn't exist, it becomes an empty dict
      'additionalTemporalEmbeds': ['timestamp_in_day'], # set by DATASET_DEPENDENT_SETTING
      'inputLength': 12, # set by DATASET_DEPENDENT_SETTING
      'outputLength': 12, # set by DATASET_DEPENDENT_SETTING
      'targetSensorAttributes': ['speed'], # set by DATASET_DEPENDENT_SETTING
      'trainTestRatio': 0.7, # from DEFAULT
      'learningRate': 0.001, # from DEFAULT
      'weightDecay': 0.0005, # from DEFAULT
  }
  ```
  and so on. You can add `print(stdin)` inside the loop of `main()` in `query-bulk-insert.py` to print the generated stdin for better understanding and debugging.

  c. After `DEFAULT` and `VARIATION` are ready, you can insert the jobs and wait for their completion.
  ```
  python ./query-bulk-insert.py # run SPATS!
  ```
  The Broker works as a queue, so you can insert more jobs without waiting for the completion of previously inserted jobs.
- Get job information and delete jobs.

  a. Full job info (only shows the 100 most recently inserted jobs).
  ```
  python ./query.py list
  ```
  b. Job info filtered by status (only shows the 100 most recently inserted jobs).
  ```
  python ./query.py list <started|pending|success|failure>
  ```
  c. Single job info.
  ```
  python ./query.py select <job_id>
  ```
  d. Delete a job.
  ```
  python ./query.py delete <job_id>
  ```
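  If `query.py` does not show enough detail, you can also look at the Broker's SQLite database directly. The sketch below is our own read-only inspection of `broker.sqlite3` that lists its tables and row counts without assuming any particular schema; it is not a SPATS tool.
  ```python
  # Minimal sketch (not part of SPATS): list the tables and row counts in the
  # Broker's SQLite database without assuming its schema.
  import sqlite3

  with sqlite3.connect('broker.sqlite3') as conn:
      tables = [row[0] for row in
                conn.execute("SELECT name FROM sqlite_master WHERE type='table'")]
      for table in tables:
          count = conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]
          print(f'{table}: {count} rows')
  ```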
- Visualize results.

  a. Copy the `broker.sqlite3`-related files to the `visualize/` folder.
  ```
  cp broker.sqlite3* visualize/.
  ```
  b. Run the extractor.
  ```
  cd visualize/
  python ./extractor.py broker.sqlite3 success
  ```
  c. Run `1-replace-model-and-dataset.ipynb` and `2-concat-results.ipynb` in order to get `result.csv`.

  d. Use the `make-fig*.ipynb` files to generate comparison results similar to those in our paper.
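  If you just want a quick look at `result.csv` before opening the figure notebooks, a short pandas sketch like the one below works; it makes no assumption about the column names.
  ```python
  # Minimal sketch (not part of SPATS): take a first look at the extracted results.
  import pandas as pd

  df = pd.read_csv('result.csv')
  print(df.shape)              # rows x recorded fields
  print(df.columns.tolist())   # see which settings and metrics were recorded
  print(df.head())
  ```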