Advanced Photon Source

An Office of Science National User Facility

CCTW - Crystal Coordinate Transformation Workflow

Projection of a transformed dataset along a principal crystallographic axis.

Single-crystal diffuse scattering is a powerful tool that provides vital information on complex disorder and can make important contributions to a large number of scientific fields. Making its use more routine requires fast 2-dimensional detectors, which generate vast volumes of data. The LDRD Grand Challenge project “Discovery Engines for Big Data” attempts to use ‘big data’ techniques to provide an accessible means of working with these large datasets.

CCTW performs the transformation from real to reciprocal coordinates. Typical datasets are of order 160 GB in size (typically 1500x1500x3600 elements). CCTW divides the input and output datasets into smaller ‘chunks’ (typically 100x100x100 elements in size), transforms the chunks in parallel on a multicore machine or on the nodes of a clustered system, and merges the resulting transformed chunks into a final dataset.
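The chunking idea can be illustrated with a short Python sketch. This is not CCTW's own code; the function names `chunk_grid` and `chunk_slices` are hypothetical, and only the dataset and chunk dimensions are taken from the text above. It shows how a dataset of a given shape might be split into index ranges, one per chunk.

```python
from itertools import product
from math import ceil, prod


def chunk_grid(shape, chunk):
    """Number of chunks along each axis (the last chunk on an axis may be partial)."""
    return tuple(ceil(n / c) for n, c in zip(shape, chunk))


def chunk_slices(shape, chunk):
    """Yield (chunk_index, slices) for every chunk covering an array of `shape`."""
    grid = chunk_grid(shape, chunk)
    for idx in product(*(range(g) for g in grid)):
        yield idx, tuple(
            slice(i * c, min((i + 1) * c, n))
            for i, c, n in zip(idx, chunk, shape)
        )


# Dimensions quoted in the text: 1500x1500x3600 elements, 100x100x100 chunks.
shape = (1500, 1500, 3600)
chunk = (100, 100, 100)
grid = chunk_grid(shape, chunk)   # (15, 15, 36)
n_chunks = prod(grid)             # 8100 chunks in total
```

Each `(chunk_index, slices)` pair is an independent unit of work, so in principle the pairs could be handed to separate worker processes or cluster nodes and the results merged afterwards.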

By performing a preliminary analysis of the dependencies between input and output chunks, it is possible to choose an order of processing input data that minimizes the amount of intermediate RAM required to perform the transform. This has been observed to reduce the peak memory requirements for a transform from ~600 GB to about 10 GB. The reduction has an immense impact on performance, because typical compute nodes have relatively modest memory capacity.
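Why processing order affects peak memory can be seen in a toy model. The sketch below is an assumption-laden illustration, not the analysis CCTW actually performs: it assumes an output chunk must stay in memory from the first input chunk that contributes to it until the last, and it uses a simple greedy heuristic (the hypothetical `greedy_order`) that always picks the input chunk opening the fewest new output chunks.

```python
from collections import Counter


def peak_open_outputs(order, deps):
    """Peak number of output chunks held in memory for a given input order.

    deps maps an input chunk id to the set of output chunk ids it contributes to.
    """
    order = list(order)
    remaining = Counter(out for inp in order for out in deps[inp])
    open_outputs, peak = set(), 0
    for inp in order:
        open_outputs |= deps[inp]
        peak = max(peak, len(open_outputs))
        for out in deps[inp]:
            remaining[out] -= 1
            if remaining[out] == 0:
                open_outputs.discard(out)  # complete, can be written out
    return peak


def greedy_order(deps):
    """Heuristic order: next input is the one that opens the fewest new outputs."""
    remaining = Counter(out for outs in deps.values() for out in outs)
    pending, open_outputs, order = set(deps), set(), []
    while pending:
        nxt = min(sorted(pending), key=lambda i: len(deps[i] - open_outputs))
        pending.remove(nxt)
        order.append(nxt)
        open_outputs |= deps[nxt]
        for out in deps[nxt]:
            remaining[out] -= 1
            if remaining[out] == 0:
                open_outputs.discard(out)
    return order


# Toy dependency map (made up): inputs 0 and 3 feed output "A", and so on.
deps = {0: {"A"}, 1: {"B"}, 2: {"C"}, 3: {"A"}, 4: {"B"}, 5: {"C"}}
naive_peak = peak_open_outputs(range(6), deps)             # 3 outputs open at once
better_peak = peak_open_outputs(greedy_order(deps), deps)  # 1 output open at once
```

In this toy case, reordering the inputs so that both contributors to an output chunk are processed back to back cuts the peak from three open output chunks to one. The same principle, applied at scale, is what makes a large reduction in peak memory plausible.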

Distribution & Impact

CCTW is released under a GNU license. The source code for CCTW is available from the SourceForge project page git repository at

Funding Source

This project has been produced with support from the Argonne LDRD Grand Challenge project “Discovery Engines for Big Data”.