
Hardware/software requirements

In order to provide a backup for the main on-line science archive at ROE, and also to facilitate several of the advanced processing/reprocessing stages needed, we propose to store the raw science frames and master calibration data of WFCAM on-line in Cambridge, essentially as an extension of the rest of the UKIRT raw data archive. Costs of suitably scaled commodity RAID arrays ($\approx$10 Tbytes of capacity for each full year of operation) have already fallen far enough (currently £5k per Tbyte, but likely to be half that price by 2004 when the first tranche of bulk storage will need to be in place) to make this a cost-effective, viable and scalable solution.
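As a rough check on the implied cost (a back-of-the-envelope estimate using only the figures quoted above, and assuming roughly linear scaling with capacity), one year of on-line raw data storage comes to $\approx$10 Tbytes $\times$ £5k/Tbyte $\approx$ £50k at current prices, falling to $\approx$£25k per year of operation if the price per Tbyte halves by 2004 as anticipated.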

With on-line storage in place at two sites we do not foresee the need for further (expensive) near-line ``tape'' storage systems, provided that a high-density transfer medium is used from Hawaii so that the $\approx$100 Gbytes of data expected per night fits on a single ``tape''. Should recovery from the raw tape archive ever be required, this would be feasible even for tapes held on the shelf.

Precisely which system to use for data transfer between the JAC and Cambridge is still under discussion. If tapes are used, CASU will need to purchase two drives, both to guarantee the transfer speed and to provide a backup in the event of unit failure. Transfer of processed data between Cambridge and Edinburgh will be via the internet. CASU currently share a 1 Gbit/s link into the LAN. Assuming WFAU have a similar level of access by 2003-4, 10% of that bandwidth would suffice to transfer a day's processed data in a few hours.
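As an indicative sanity check (assuming, for illustration, that the daily volume of processed data is comparable to the $\approx$100 Gbytes of raw data per night), 10% of a 1 Gbit/s link corresponds to $\approx$12.5 Mbytes/s, so 100 Gbytes would take $\approx$100,000/12.5 $\approx$ 8000 s, i.e. a little over two hours, consistent with the few-hour transfer window quoted above.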

We expect to have sufficient computing power in place, as part of the Cambridge Data Processing Centre activities, to handle the development of the WFCAM pipeline, including trialling different processing methods and benchmarking the processing of simulated WFCAM data. However, at some point during 2003 we will need to purchase and set up a modest multi-node PC cluster to deal with the real data flow expected to start toward the end of that year, together with a modest high-performance PC-based system with which to begin serious investigation of advanced processing options, and of the equally heavy-duty simulations needed to assess catalogue completeness and the reliability of parameter error estimates.


Nigel Hambly 2002-08-23