The National Institute for Computational Sciences

Data Transfer - ACF-SIP


Introduction


Data transfer on the ACF-SIP is performed with the Globus research data management platform. Do not use traditional file transfer tools such as SCP or SFTP on the ACF-SIP. This document explains how to prepare data before transferring it, configure Globus for your use, and initiate transfers using the Globus web interface. Perform all data transfer operations on the ACF-SIP's DTN (data transfer node), which you can access from the ACF-SIP login nodes as shown in Figure 1.1. If you need help accessing the ACF-SIP login nodes, please refer to the Access and Login document.

ssh <NetID>@sip-dtn1.acf.tennessee.edu
Figure 1.1 - Accessing the ACF-SIP DTN

Preparing Data


Before you initiate any data transfers to or from the SIP, consider archiving and compressing the data you wish to transfer. Archiving combines several files and directories into a single file, which makes the data easier to organize and transfer. Compressing reduces the total size of the data, and therefore the amount that must be sent across the network. At the time of this writing, the tar and zip utilities are the best tools for data archiving and compression for SIP users across Linux, macOS, and Windows.

When you prepare your data, please avoid using a login node. Instead, use the ACF-SIP’s DTN (data transfer node). Figure 1.1 in the Introduction shows how to access the DTN.

Using the tar Utility

The tar (tape archive) utility uses simple command syntax and allows large amounts of data to be aggregated into a single archive. Linux, macOS, and updated Windows 10 systems can use tar. Older Windows systems are limited to the zip utility.

To create a tar archive, execute tar czvf <archive-name> <dir-to-archive>. The c option creates a new archive, z compresses it with gzip, v lists each file as it is added, and f specifies the archive file name. Replace the <archive-name> argument with the name of the new archive, ending with the .tar.gz extension, as in my_archive.tar.gz. Replace the <dir-to-archive> argument with the directory you wish to place within the archive. If that directory is not within your working directory, specify the relative or absolute path to it. By default, tar recursively places the directory and its contents into the new archive. Figure 2.1 shows the successful creation of a tar archive.

[user@sip-dtn1 ~]$ tar czvf new_archive.tar.gz Documents
Documents/
Documents/IntroUnix.pdf
Documents/JobSubData.zip
Documents/MATLAB/
Documents/Scripts.zip
Documents/PyLists.py
Figure 2.1 - Creating a tar Archive

After the archive is created, execute ls -l to verify that the archive exists. You can view its contents with the tar tvf <archive-name> command. You may then transfer the archive using Globus. Please refer to the Configuring Globus section to learn how to configure it for your system.

On the remote system, execute tar xvf <archive-name> to extract the contents of the archive. The files will be extracted into your working directory.
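The create, list, and extract commands described above can be tried end to end before working with real data. The following is a minimal sketch, assuming a POSIX shell with tar, mktemp, and diff available. The scratch directory, the sample files, and the restored directory name are placeholders chosen for illustration, and the -C option (extract into a given directory) is used here only to stand in for the remote system.

```shell
# Round trip with tar in a throwaway directory (all names are placeholders).
set -eu

workdir=$(mktemp -d)            # scratch area so no real data is touched
cd "$workdir"

# Build a small sample directory to stand in for <dir-to-archive>.
mkdir -p Documents/MATLAB
printf 'print([1, 2, 3])\n' > Documents/PyLists.py

# Create a gzip-compressed archive of the directory.
tar czvf new_archive.tar.gz Documents

# Confirm the archive exists, then view its contents without extracting.
ls -l new_archive.tar.gz
tar tvf new_archive.tar.gz

# Extract into a separate directory, standing in for the remote system.
mkdir restored
tar xvf new_archive.tar.gz -C restored

# diff -r exits with status 0 only if the original and extracted trees match.
diff -r Documents restored/Documents && echo "round trip OK"
```

Comparing the extracted tree against the original is optional, but it is a cheap way to confirm that an archive is complete before the source copy is deleted.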

Using the zip Utility

If the system that will receive your data is an older Windows system that cannot use tar, use the zip utility to archive and compress your data on the SIP.

To create a zip archive on the SIP, execute zip -r <archive-name>.zip <dir-to-archive>. Replace the <archive-name> argument with the name of the new zip archive. The .zip extension is optional; if you omit it, the zip utility adds it automatically. Replace the <dir-to-archive> argument with the directory you wish to place in the zip archive. If that directory is not in your working directory, specify the relative or absolute path to it. The -r option ensures that the directory and all of its contents are archived and compressed. Figure 2.2 shows the successful creation of a zip archive.

[user@sip-dtn1 ~]$ zip -r Documents Documents
  adding: Documents/ (stored 0%)
  adding: Documents/IntroUnix.pdf (deflated 4%)
  adding: Documents/MATLAB/ (stored 0%)
  adding: Documents/PyLists.py (deflated 61%)
Figure 2.2 - Creating a zip Archive

After the zip archive has been created, execute ls -l in the directory from which you created it to ensure the archive exists. It will appear with the name you gave to the archive followed by the .zip extension.
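The zip workflow can be rehearsed the same way. This is a minimal sketch under a few assumptions: the scratch directory and sample files are placeholders, the zip_status variable is a hypothetical helper used only to report what happened, and the unzip utility (used to list the archive contents without extracting) may not be installed everywhere, so both tools are checked before use.

```shell
# Sketch of the zip workflow in a throwaway directory (placeholder names throughout).
set -eu

workdir=$(mktemp -d)
cd "$workdir"

# Build a small sample directory to stand in for <dir-to-archive>.
mkdir -p Documents/MATLAB
printf 'print([1, 2, 3])\n' > Documents/PyLists.py

if command -v zip >/dev/null 2>&1; then
    # No .zip suffix is given, so zip appends it and writes Documents.zip.
    zip -r Documents Documents
    ls -l Documents.zip
    zip_status="created"
else
    echo "zip is not installed on this system" >&2
    zip_status="skipped"
fi

# List the archive contents without extracting, if unzip is available.
if [ "$zip_status" = "created" ] && command -v unzip >/dev/null 2>&1; then
    unzip -l Documents.zip
fi
```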

With the zip archive created and verified, transfer it to your system using Globus. Please refer to the Configuring Globus section to learn how to use it on your system. Once you transfer the zip archive to your Windows system, open File Explorer and navigate to the directory in which you placed the archive. Right-click on the archive and select the “Extract All…” option in the submenu. Figure 2.3 shows where to locate this option. Specify the directory in which the contents should be extracted, then select “Extract.” You may then open the extracted folder and browse its contents.

Figure 2.3 - Extracting the Contents of a zip Archive in Windows

Configuring Globus


Before you can use the Globus interface and access endpoints, you must associate your user certificate’s distinguished name (DN) with your account. The association process is described in the Navigating the User Portal document. If you have not associated your certificate’s DN with your account, please perform that task before continuing to configure Globus.

If you have associated your certificate’s DN to your account, then navigate to the Globus website. In the top-right of the webpage, select “Log In.” In the dropdown menu, find and select the University of Tennessee as your identity provider, then click “Continue.” Authenticate to UT CAS with your NetID, NetID password, and Duo TFA. After you are successfully logged in, you will see the interface shown in Figure 3.1.

Figure 3.1 - Initial Globus Interface

The next step is to set up a Globus endpoint on your local machine. On the left side of the web interface, select the “Endpoints” option. In the top-right of the webpage, click on the option to “Create a personal endpoint.” Provide a descriptive name for the endpoint. Next, select the option to “Generate Setup Key.” Copy the setup key to your clipboard. Select the installer that matches your operating system. Allow the installer to run, then paste the setup key into the installer when it prompts you to provide it. After your personal endpoint has been successfully installed, start it.

Return to the Globus web interface and select the “File Manager” from the left side of the webpage. In the top-right of the page, select the two rectangles to the right of the “Panels” option. This will expand the File Manager to show two endpoints. Figure 3.2 highlights the Panels option.

Figure 3.2 - Changing to the Dual Panel View in the Globus File Manager

On either side of the File Manager, click on the “Collection” bar. Type the name you assigned to your local endpoint in the search bar. It should appear in the results. Click on the endpoint. You will then return to the File Manager.

Before you can access the SIP’s Globus endpoint, you must link your SIP account to Globus. On the left side of the web interface, select the “Account” option. From there, select the “Link Another Identity” option. Figure 3.3 shows how this option appears. Specify the University of Tennessee as your identity provider, then click “Continue.” Authenticate through UT CAS with your NetID, NetID password, and Duo TFA. If you receive an error stating “An identity cannot be linked to itself,” your SIP account is already linked to Globus and you do not need to take any further action.

Figure 3.3 - Linking an Identity in Globus

After you successfully link your SIP account to Globus, return to the File Manager. In the empty panel, click on the “Collection” bar. Search for SIP ENCLAVE STORAGE, which is the SIP’s Globus endpoint. Select this endpoint. When you return to the File Manager, you may need to authenticate to the endpoint. The process for authentication is the same as it is for accessing the Globus web interface.

If all the preceding steps were successful, the Globus File Manager interface should appear as it does in Figure 3.4.

Figure 3.4 - Globus File Manager with Two Endpoints Selected

Using Globus


To use Globus, access the Globus website and open the File Manager from the left side of the interface. If you have not configured your endpoints and identity in Globus, please review the Configuring Globus section. If you have configured these options, then you may use Globus to transfer your data.

Globus data transfers to and from your personal endpoint only work while you are connected to the UTK VPN. If you initiate a transfer without the VPN, Globus will report that it was successful, but the transferred file will be empty. To avoid this situation, connect to the VPN before you use Globus. To learn how to set up and configure the VPN on your device, please review OIT’s VPN User Guide. Transfers to or from another Globus endpoint do not require the use of a VPN.

Transferring Data in Unencrypted Space

Data that is not stored in encrypted space on the SIP can be transferred without any additional steps. The directories do not need to be mounted or decrypted. This applies to your NFS home directory and your personal Lustre project space. For more information on these directories, please review the File Systems document.

Transferring Data in Encrypted Space

For data that is stored in encrypted space on the SIP, additional steps are necessary to initiate transfers to and from these spaces. These steps are outlined below.

  1. Log in to the Citrix Secure Enclave environment.
  2. Launch the PuTTY application.
  3. Enter “sip-login1-se.acf.tennessee.edu” into the Host Name / Address field. Select “Connect.” Provide your NetID and NetID password, then authenticate with Duo TFA.
  4. Access the SIP data transfer node with ssh. Figure 4.1 depicts how to connect to the DTN using ssh.

    ssh <NetID>@sip-dtn1.acf.tennessee.edu
    Figure 4.1 - Connecting to the SIP DTN with ssh

  5. Execute the sipmount command on the SIP DTN. Figure 4.2 shows how to use this command. When you execute it, you must provide your NetID password and authenticate with Duo TFA. Replace the <project-name> argument with your project identifier, such as UTK-9999. You can find the names of the projects to which you belong in the User Portal. More information is available in the Navigating the User Portal document.

    sudo sipmount <project-name>
    Figure 4.2 - Mounting an Encrypted Project Directory

  6. Verify that the space was mounted with the ls -l command. Figure 4.3 shows the syntax to use for this command.

    ls -l /projects/<project-name>/
    Figure 4.3 - Verifying the Contents of an Encrypted Project Directory

  7. Return to the Globus File Manager and navigate to the /projects/<project-name> directory. Its contents should be visible. If not, wait approximately five minutes, then refresh the directory.

After you complete your data transfers, you may unmount the encrypted space on the SIP with the sipunmount command. Its syntax and usage are the same as those of the sipmount command. If you do not unmount the encrypted space, it will be unmounted automatically after fifteen minutes. For more information, please refer to the File Systems document.
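The commands run on the DTN for an encrypted project space can be summarized as a short script. This is only a sketch: the PROJECT and DRY_RUN variables and the run, mount_and_verify, and unmount_project helpers are hypothetical names introduced for illustration, and the sipunmount invocation is an assumption based on the statement that it mirrors sipmount. By default the script only prints the commands, so it is safe to run anywhere.

```shell
# Dry-run sketch of the DTN commands for an encrypted project space.
# PROJECT, DRY_RUN, and the helper functions are illustrative, not SIP tooling.
set -eu

PROJECT="${PROJECT:-UTK-9999}"   # example identifier from the text; replace with yours
DRY_RUN="${DRY_RUN:-1}"          # 1 = only print the commands, 0 = actually run them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

mount_and_verify() {
    run sudo sipmount "$PROJECT"        # prompts for NetID password and Duo TFA
    run ls -l "/projects/$PROJECT/"     # confirm the space is mounted
}

unmount_project() {
    run sudo sipunmount "$PROJECT"      # assumed to take the same argument as sipmount
}

mount_and_verify
# ... perform the transfers in the Globus File Manager here ...
unmount_project
```

Setting DRY_RUN=0 on the DTN would execute the commands for real, in which case the unmount step should only be reached after the Globus transfers have finished.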

Transferring Data to External Globus Endpoints

Globus enables users to transfer data from the SIP to external Globus endpoints. These external endpoints must be authorized before they can be used. From the user’s perspective, all that must be done to access an external endpoint is to authenticate to it. To reach the authentication window, search for and select the external endpoint from the “Collection” bar in one of the panels. Figure 4.4 shows the authentication window for an external endpoint in Globus. At the time of this writing, only UTHSC users have an authorized Globus endpoint. Its hostname is shown under “Login Server” in Figure 4.4.

Figure 4.4 - Authenticating to an External Endpoint



Last Updated: 04 / 29 / 2020