pcircle Overview

fcp is a program designed to do large-scale parallel data transfer from a source directory to a destination directory across locally mounted file systems. It is not intended for wide-area data transfers such as ftp, bbcp, or globus-ftp; in that sense, it is closer to cp. One crucial difference from regular cp is that fcp requires both the source and the destination to be directories; fcp will fail if this condition is not met. In the most general case, fcp works in two stages: first it analyzes the workload by walking the tree in parallel, and then it parallelizes the data copy operation.

Usage

At the OLCF, fcp is provided as a modulefile and is available on the data transfer nodes (DTNs) and the analysis cluster (Rhea). To use fcp:
  1. Load the pcircle modulefile:
    $ module load pcircle
    
  2. Use mpirun to copy a directory:
    $ mpirun -np 8 fcp src_dir dest_dir
    

fcp options

In addition to the basic invocation shown above, fcp supports the following options; a combined example is given after the list:
-p, --preserve: Preserve metadata attributes. In the case of Lustre, the
striping information is kept.
  
-f, --force: Overwrite the destination directory. The default is off.
  
--verify: Perform checksum-based verification after the copy.
  
-s, --signature: Generate a single sha1 signature for the entire dataset.
This option also implies --verify for post-copy verification.
  
--chunksize sz: fcp breaks up large files into pieces to increase
parallelism. By default, fcp adaptively sets the chunk size based on the
overall size of the workload. Use this option to specify a particular
chunk size in KB or MB, for example --chunksize 128MB.
  
--reduce-interval: Controls progress report frequency. The default is 10
seconds.
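
As an example combining several of these options in a single invocation (the
directory names are placeholders):
    $ mpirun -np 8 fcp --preserve --verify --chunksize 128MB src_dir dest_dir
Here --preserve keeps metadata (including Lustre striping), --verify adds a
post-copy checksum pass, and --chunksize overrides the adaptive chunk size.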

Using fcp inside a batch job

The data transfer nodes (DTNs) can be used to submit an fcp job to the batch queue:
  1. Connect to the DTNs:
    ssh <username>@dtn.ccs.ornl.gov
    
  2. Prepare a PBS submission script:
    #!/bin/bash -l
    #PBS -l nodes=2
    #PBS -A <projectID>
    #PBS -l walltime=00:30:00
    #PBS -N fcp_job
    #PBS -j oe
       
    cd $PBS_O_WORKDIR
    module load pcircle
    module list
       
    mpirun -n 16 --npernode 8 fcp ./src_dir ./dest_dir
    
    The --npernode option is needed to distribute the MPI processes across physical nodes. Omitting this option would place all 16 MPI processes on the same node, which in this example is not the desired behavior, as it would reduce the amount of memory available to each process.
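
  3. Submit the script with qsub and, if desired, check on it with qstat. The
     script name fcp_job.pbs below is only an assumed filename; use whatever
     name the script was saved under:
    qsub fcp_job.pbs
    qstat -u <username>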

Performance considerations

fcp performance is subject to the bandwidth and conditions of the source file system, the storage area network, the destination file system, and the number of processes and nodes involved in the transfer. Using more processes per node does not necessarily result in better performance, due to the increase in metadata operations and the additional contention generated by a larger number of processes. A rule of thumb is to run one MPI process per physical core, or one per two physical cores, on each transfer node.

Both the post-copy verification (--verify) and dataset signature (--signature) options have performance implications. When enabled, fcp calculates checksums of each chunk/file for both the source and the destination, in addition to reading the data back from the destination. This increases both the amount of bookkeeping and the memory usage. Therefore, for large-scale data transfers, a large-memory node is recommended.
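
As a rough sketch only, assuming transfer nodes with 16 physical cores (adjust
the process counts to the actual hardware), this rule of thumb translates to:
    $ mpirun -n 32 --npernode 16 fcp src_dir dest_dir   # one process per core, 2 nodes
    $ mpirun -n 16 --npernode 8 fcp src_dir dest_dir    # one process per two cores, 2 nodes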

Author

Feiyi Wang (fwang2@ornl.gov), ORNL Technology Integration group.

Need help?

If you encounter any issues or have any questions, please contact the OLCF User Assistance Center.

Builds

RHEA

  • 0.15rc14