Categories: Data Management, Software
fcp is a program designed for large-scale parallel data transfer from a source directory to a destination directory across locally mounted file systems. It is not intended for wide-area transfers, for which tools such as ftp, bbcp, or globus-ftp are better suited. In that sense, it is closer to cp. One crucial difference from regular cp is that fcp requires both the source and the destination to be directories; fcp will fail if this condition is not met. In the most general case, fcp works in two stages: first it analyzes the workload by walking the directory tree in parallel, then it parallelizes the data copy operation.
At the OLCF, fcp is provided as a modulefile and it is available on the data transfer nodes (DTNs) and the analysis cluster (Rhea). To use fcp:
- Load the pcircle modulefile:
$ module load pcircle
- Use mpirun to copy a directory:
$ mpirun -np 8 fcp src_dir dest_dir
In addition, fcp supports the following options:
- -p, --preserve: Preserve metadata attributes. In the case of Lustre, the striping information is kept.
- -f, --force: Overwrite the destination directory. The default is off.
- --verify: Perform checksum-based verification after the copy.
- -s, --signature: Generate a single sha1 signature for the entire dataset. This option also implies --verify for post-copy verification.
- --chunksize sz: fcp will break up large files into pieces to increase parallelism. By default, fcp adaptively sets the chunk size based on the overall size of the workload. Use this option to specify a particular chunk size in KB or MB, for example --chunksize 128MB.
- --reduce-interval: Controls progress report frequency. The default is 10 seconds.
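For example, a transfer that preserves Lustre striping and pins the chunk size to 64 MB might look like the following (the directory names are placeholders):

```shell
# Preserve metadata attributes (including Lustre striping) and
# override the adaptive chunking with a fixed 64 MB chunk size.
# src_dir and dest_dir are placeholder paths.
$ mpirun -np 8 fcp --preserve --chunksize 64MB src_dir dest_dir
```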
Using fcp inside a batch job
The data transfer nodes (DTNs) can be used to submit an fcp job to the batch queue:
- Connect to the DTNs:
- Prepare a PBS submission script:
#!/bin/bash -l
#PBS -l nodes=2
#PBS -A <projectID>
#PBS -l walltime=00:30:00
#PBS -N fcp_job
#PBS -j oe

cd $PBS_O_WORKDIR
module load pcircle
module list
mpirun -n 16 --npernode 8 fcp ./src_dir ./dest_dir
--npernode is needed to distribute the MPI processes across physical nodes. Omitting this option would place all 16 MPI processes on the same node, which in this example is not the desired behavior, as it would reduce the amount of memory available to each process.
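Once the script is saved (here as fcp_job.pbs, a filename chosen for illustration), it can be submitted from a DTN with the standard PBS commands:

```shell
# Submit the job script to the batch queue; qsub prints the job ID.
$ qsub fcp_job.pbs

# Check the status of your queued and running jobs.
$ qstat -u $USER
```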
fcp performance is subject to the bandwidth and conditions of the source file system, the storage area network, the destination file system, and the number of processes and nodes involved in the transfer. Using more processes per node does not necessarily result in better performance due to an increase in the number of metadata operations as well as additional contention generated from a larger number of processes. A rule of thumb is to match or halve the number of physical cores per transfer node.
Both the post-copy verification (--verify) and dataset signature (--signature) options have performance implications. When either is turned on, fcp calculates the checksums of each chunk/file for both source and destination, in addition to reading the data back from the destination. This increases both the amount of bookkeeping and the memory usage. Therefore, for large-scale data transfers, a large-memory node is recommended.
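A minimal invocation that requests a dataset signature (which, per the option description above, also enables post-copy verification) might look like this, with placeholder paths:

```shell
# --signature implies --verify: fcp checksums every chunk on both
# sides and reports a single sha1 signature for the whole dataset.
$ mpirun -np 8 fcp --signature src_dir dest_dir
```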
Feiyi Wang (email@example.com), ORNL Technology Integration group.
If you encounter any issues or have any questions, please contact OLCF User Assistance Center.