VISTA DATA FLOW SYSTEM

(VDFS)

---------------

for VISTA & WFCAM data

WSA Software Architecture Design

author
Ian Bond (WFAU Edinburgh)
WSA Programmer
number
VDF-WFA-WSA-009
issue
1.0
date
28 May 2003
co-authors
Nigel Hambly, Mike Read, Eckhard Sutorius

SCOPE
OVERVIEW
FUNDAMENTALS
- Rationale
- An introduction to UML component and deployment diagrams
OS LEVEL COMPONENTS AND DEPLOYMENT
WSA SYSTEM COMPONENTS
WRAPPER MODULES AND THEIR INTERFACES
IMPLEMENTATION OF CURATION USE CASES
THE CURATION DISCOVERY TOOL
USER INTERFACE COMPONENTS
SUMMARY
- Groups of components
- Curation task matrix
APPENDICES
ACRONYMS & ABBREVIATIONS
APPLICABLE DOCUMENTS
CHANGE RECORD
NOTIFICATION LIST

SCOPE

This document presents the design of the software architecture for the WFCAM Science Archive (WSA). The objective is to present a clear blueprint for the architecture by identifying those software components that need to be developed and deployed. The intended audience is those members of the WSA team directly involved in the construction of the archive. This document is also expected to be informative to interested external parties.

It is important to emphasise here the iterative nature of this document. It is anticipated that this document will undergo a number of iterations during the construction of the archive software architecture. As experience and knowledge grow during the coding phases, there will inevitably be a "refactoring" of parts of the architecture. Future releases of this document will reflect all changes taking place, with the final issue matching the final design of the archive.

The purpose of the first issue of this document is to provide a starting point from which the coding phase can begin.

The requirement for this document is a consequence of one of the findings of the Critical Design Review, which expressed a need for a design of the software architecture. The external requirements on the design follow on from the content of the Data Flow Document (VDF-WFA-WSA-005), the Hardware/OS/DBMS Design Document (VDF-WFA-WSA-006), the Database Design Document (VDF-WFA-WSA-007), and the User Interface Document (VDF-WFA-WSA-008). Also relevant is the Interface Control Document (VDF-WFA-WSA-004), which determines the requirements on the format of data to be transferred from CASU to WFAU and into the archive.

OVERVIEW

This document is structured in a top-down approach in terms of levels of complexity.

Section 3 describes the fundamental concepts behind our approach to designing the software architecture and introduces UML component and deployment diagrams. Section 4 recaps the hardware to be deployed in the WSA and describes the architecture from an operating system standpoint. Section 5 moves down to the next level of detail and describes the different types of components, such as modules and scripts, that will be deployed on the system architecture. All individual components are listed in that section. Section 6 gives details of the wrapper modules identified in Section 5. The implementation details of the curation use cases, and how they use these wrapper modules, then follow in Section 7.

Readers who want a complete listing of all components to use as a reference can find it in Sections 4-5 and skip the subsequent sections. Those who wish to see full implementation details should also read Sections 6-9. These sections are particularly aimed at those WSA team members who will do the actual coding.

FUNDAMENTALS

Rationale

The software architecture design for the WSA follows on from the work documented previously. Here we briefly recap what has been done so far:

In the Data Flow Document (VDF-WFA-WSA-005), a top level use case analysis for the WSA was presented, where the system's behaviour was described from a user's standpoint. This identified who or what uses the system along with a collection of scenarios that they enact. In the Database Design Document (VDF-WFA-WSA-007), the scenarios performed by the archive scientist were expanded upon and developed into the identification of 20 curation use cases. This type of analysis enables the identification of system requirements.
The Data Flow Document also analysed the flow of data from input to the WSA through to the end user. The contents were identified along with the data products offered to the end user. This led to identifying the software that will be involved in these data flows.
Details on how pixel, catalogue, and other data are stored in the WSA were given in the Database Design Document with particular reference to design of database schemas and tables that will be managed by the DBMS.
The User Interface Document (VDF-WFA-WSA-008) described how data will be delivered to the end user and described the underlying software that will be involved.
The Hardware/OS/DBMS Design Document (VDF-WFA-WSA-006) described the hardware on which all of the above will be deployed. The various computer systems and their operating systems were described along with details on how they will be networked.

In this document, all this is taken to the next level by presenting detailed designs of the software architecture. Here, it is necessary to identify real software entities (as opposed to abstract modelling concepts), show their inter-dependencies and inter-relationships, and show where they will be deployed within the WSA. In designing the software architecture, we are taking a component oriented view. The goals of this are to enable re-use of individual software components and to allow existing components to be replaced with newer, perhaps upgraded, ones with a minimum of disruption to the system as a whole.

Design blueprints are necessary here, so that developers of the WSA can clearly see how all the components work with each other, and how components are affected if one of them is removed, changed or replaced. The UML component and deployment diagrams provide a means for drawing such block diagrams and it makes sense to adopt this framework here.

An introduction to UML component and deployment diagrams

Component diagrams in the UML depict the inter-relationships amongst software entities in a system through their inter-dependencies and their interfaces. For illustrative purposes, a generic component diagram is given in Fig. 1. There are three modelling elements employed in component diagrams.

A component is a software entity that physically resides in the computer system. These include items such as scripts, executables, source code, database tables etc. Components are not abstract modelling entities like UML classes or entity-relationship models. A component may be labelled with its stereotype which identifies the component with a particular type, the name of the component, and a brief explanation of the role carried out.
An interface is a function/method/subroutine a component offers to other components. It represents a task that the component supporting (or providing) the interface carries out. The relationship between an interface and the component supporting it is represented by a solid line, as depicted by the interface "lollipop" shown in Fig. 1. An interface is labelled
Dependency relationships between components are denoted by the dashed lines shown in Fig. 1 with the direction of the dependency indicated by the direction of the arrow, i.e. a component depends upon the component pointed to. Dependencies are also used to depict a component using an interface supported by another component as shown in Fig. 1.

The UML deployment diagram shows the configuration and deployment of components on physical hardware devices. An individual device is referred to as a node and is depicted by a 3D box as shown in Fig. 1. As with UML components, stereotypes may be used to denote the type of hardware.
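The dependency arrows of a component diagram can also be held as a small directed graph, which makes the question "what is affected if this component changes?" mechanical. The following is a minimal sketch, not part of the WSA design: the edges are a partial subset taken from the Dependencies entries and pseudo-scripts given later in this document, and the names `DEPENDS_ON` and `affected_by` are hypothetical.

```python
# Edges read "key depends on each item in the value list".
# Partial, illustrative subset of the dependencies stated in this document.
DEPENDS_ON = {
    "CU1.py": ["DbHandler.py", "FitsReader.py"],
    "CU2.py": ["DbHandler.py", "FitsReader.py", "CompressHandler.py"],
    "FitsReader.py": ["CFITSIO.py"],
    "CFITSIO.py": ["cfitsio"],
    "CompressHandler.py": ["PythonMagick.py"],
}


def affected_by(component, depends_on=DEPENDS_ON):
    """Return every component that depends, directly or indirectly, on `component`."""
    affected = set()
    frontier = [component]
    while frontier:
        target = frontier.pop()
        for name, deps in depends_on.items():
            # Follow the dependency arrows backwards, visiting each component once.
            if target in deps and name not in affected:
                affected.add(name)
                frontier.append(name)
    return affected


if __name__ == "__main__":
    print(sorted(affected_by("cfitsio")))
```

With these edges, replacing the cfitsio library touches CFITSIO.py, FitsReader.py and the two curation scripts that read FITS files, while nothing depends on CU1.py itself.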

**Figure 1:** Generic UML component diagram and deployment diagram. Component 2 is deployed on the hardware node depicted by the 3D box.

OS LEVEL COMPONENTS AND DEPLOYMENT

In this section a top-level view of the system architecture is given. Here we consider the operating systems to be used along with high-level enabling systems such as web servers, database management systems, and Java containers. These are listed in Table 1. The overall role of these components is to run the computer systems involved and to provide a framework that enables development of the components that provide the required functionality of the WSA. Fig. 2 shows the deployment of the top level components on the hardware described in the Hardware Design Document (VDF-WFA-WSA-006).

As can be seen in Fig. 2, and as described in the Hardware Design Document, the software architecture of the WSA will be deployed on a number of servers.

The Web Server is a PC running Linux which provides the entry point for users to access data products from the WSA. All user interface components will be deployed here.
The Catalogue Server is a PC running under Windows which contains a copy of the latest release of databased products. It is from here that catalogue data products are provided to the user, via the web server.
The Load Server is a PC running under Windows which is used by the archive scientists for managing the WSA.
The Curation Server is a PC running Linux onto which the WSA curation use cases are deployed. This server will run all the software requiring pixel and catalogue data processing involved in generating archived data products, as well as handling those tasks requiring connections to the database management system deployed on the load and catalogue servers. This server will be one of the PCs that form a node of the mass storage system.

Table 1: Operating system level components deployed on the WSA

Component Name	Stereotype	Deployment	Function
Linux	OS	Curation Server, Web Server	operating system
Windows XP	OS	Catalogue Server, Load Server	operating system
MS SQL Server	DBMS	Catalogue Server, Load Server	DB management system
Apache	HTTP server	Web Server	HTTP server
Tomcat	Java code	Web Server	Servlet container classes

**Figure 2:** UML deployment diagram showing the main hardware and operating system components of the WFCAM Science Archive.

WSA SYSTEM COMPONENTS

System Curation Components

The WSA curation use cases, identified in the Database Design Document, are a set of tasks carried out by the archive scientist. These encompass all processes by which data is ingested, processed and rendered in a form suitable for user access. Each of the 20 use cases will be implemented as a Python script and as such each corresponds to a single software component. Those components that will be implemented in Versions 1 and 2 of the WSA are listed in Table 2. These are developed in detail in Section 7.

Included within the use cases are tasks involving database queries and operations that take place on the Load Server along with image processing and data analysis applications that need to be invoked on a Linux workstation. It was decided to implement all curation use cases on one workstation providing a single point of entry for management of these tasks. As discussed in Section 4, one of the PCs that form a node of the mass storage system will be used as the Curation Server.

A special curation use case is the "discovery" mechanism whereby the archive scientist identifies which curation use cases need to be carried out at a particular time or when a given use case needs to be carried out. Plans for implementing the Curation Discovery Tool will be discussed in Section 8.

Table 2: Curation task components for WSA.

Component Name	Version & priority	Stereotype	Deployment	Function
CU1.py	V1 p1	Python script	Curation Server	Fetch data from CASU
CU2.py	V1 p1	Python script	Curation Server	Create library compressed images
CU3.py	V1 p1	Python script	Curation Server	Ingest science and compressed image metadata
CU4.py	V1 p1	Python script	Curation Server	Ingest single frame source detections
CU5.py	V1 p3	Python script	Curation Server	Create library H$_2$-K difference images
CU6.py	V1 p1	Python script	Curation Server	Create spatial indices for all new records having celestial coordinates
CU7.py	V1 p1	Python script	Curation Server	Recalibrate photometry
CU8.py	V1 p2	Python script	Curation Server	Create/update merged source catalogues
CU9.py	V1 p3	Python script	Curation Server	Produce list driven measurements between WFCAM passbands
CU10.py	V2 p2	Python script	Curation Server	Compute/update proper motions
CU11.py	V2 p3	Python script	Curation Server	Recalibrate astrometry
CU12.py	V1 p1	Python script	Curation Server	Get publicly released and/or consortium supplied external catalogues
CU13.py	V1 p2	Python script	Curation Server	Create library stacked/mosaic images
CU14.py	V1 p2	Python script	Curation Server	Create standard source detection list from any new stacked/mosaiced image frame product
CU15.py	V1 p2	Python script	Curation Server	Run periodic curation tasks CU6-CU9
CU16.py	V1 p1	Python script	Curation Server	Create default joins with external catalogues
CU17.py	V2 p2	Python script	Curation Server	Produce list driven measurements between WFCAM and non-WFCAM imaging data
CU18.py	V1 p1	Python script	Curation Server	Create/recreate table indices
CU19.py	V1 p1	Python script	Curation Server	Verify, freeze, and backup
CU20.py	V1 p1	Python script	Curation Server	Release--place online new DB product

Wrappers

The curation of the WSA involves a wide range of operations performed on the data stored there. Some of these operations require a number of software packages to be installed, along with any libraries or additional software packages on which they depend. These include, for example, software developed at CASU to build image stacks and mosaics, and off-the-shelf libraries such as CFITSIO for low level manipulation of FITS files. Other operations involve submitting specific queries to the database or performing some sort of database modification or manipulation. Most of the curation use cases involve some combination of these operations.

The analysis of the curation use cases carried out in the Database Design Document identified specific tasks to be carried out within them. Some of these tasks are carried out by more than one use case. In our design of the software architecture, we aim to present these tasks as a set of well defined interfaces that hide the implementation details behind them. Additionally, we have grouped tasks with certain common properties into a set of wrapper modules. For example, database queries are handled by one module, database modifications are handled by another module, FITS file reading tasks are handled by another, and so on. Each of these will be implemented as a Python module and as such each represents one component, with each member interface performing a required task and implemented as a Python subroutine.

These wrapper module components are listed in Table 3 along with the underlying software components that are used. The corresponding inter-dependencies, along with the interface layer offered to the curation use cases, can be seen in the component diagram given in Fig. 3. A detailed description of each wrapper module along with each of their interfaces is given in Section 6. It is important to note that the curation use case scripts only access the interface layer. By carefully defining "clean" interfaces in this way, any changes that are made to the components behind the interface stay localised. Their effects do not propagate through the entire system.

**Figure 3:** Wrapper modules and their dependencies. The wrappers provide the clean interface layer for the curation tasks.

Table 3: Wrapper modules and underlying components that provide the clean interface layer to the curation tasks.

Component Name	Stereotype	Deployment	Function
DbRpcServer.py	Python module	Load Server	Provides Db handling services via RPC
DbHandler.py	Python module	Curation Server	Accesses Db services via RPC
QueryHandler.py	Python module	Curation Server	Handles queries to database
FitsReader.py	Python module	Curation Server	Carries out specific FITS file reading operations
CompressHandler.py	Python module	Curation Server	Handles library compression operations
CatExDriver.py	Python module	Curation Server	Driver for source extraction software
DiaDriver.py	Python module	Curation Server	Driver for difference imaging software
StackDriver.py	Python module	Curation Server	Driver for stacking/mosaicing software
PhotoCalibDriver.py	Python module	Curation Server	Driver for software to obtain photometric calibration solutions
DIA Tools	Software package	Curation Server	Suite of programs for difference imaging
CASU Extractor	Software package	Curation Server	CASU supplied source extraction, measurement, and classification software
CASU Stacker	Software package	Curation Server	CASU supplied stacking and mosaicing software
PhotoCal	Software package	Curation Server	Set of C/Fortran programs for photometric calibration solving
Pairing tools	Software package	Load Server	Set of C/Fortran programs for object cross referencing?

User access enabling components

The plans for implementing the user interface were described in the User Interface Document (VDF-WFA-WSA-008). Two types of access tools are planned for V1: those serving pixel data and those serving catalogue data. The components that need to be developed at WSA are listed in Table 4 for pixel access and in Table 5 for catalogue access.

Table 4: Image serving enabling components

Component Name	Stereotype	Deployment	Function
PF1	HTML file	Web Server	Web form for small image extraction
PF2	HTML file	Web Server	Web form for batch image extraction
PF3	HTML file	Web Server	Web form for large image generation
PF4	HTML file	Web Server	Web form for stacked image generation (V2)
pf1.py	CGI script	Web Server	Extracts and delivers single small image
pf2.py	CGI script	Web Server	Batch mode small image extraction
pf3.py	CGI script	Web Server	Large image extraction and delivery
pf4.py	CGI script	Web Server	Stacked image extraction and delivery
Image Tools	Software package	Web Server	Sub-image extraction from FITS
UiQueryHandler.py	Python module	Web Server	Handles queries to database
StackDriver.py	Python module	Web Server	Driver for stacking/mosaicing software
CASU Stacker	Software package	Web Server	CASU supplied stacking and mosaicing software

Table 5: User interface enabling components

Component Name	Stereotype	Deployment	Function
OF1	HTML file	Web Server	Radial search form
OF2	HTML file	Web Server	Rectangular search form
OF3	HTML file	Web Server	SQL query form
OF4	HTML file	Web Server	Catalogue cross matching form
OF1	HTML file	Web Server	Radial search object form
CatServer	Java Servlet	Catalogue Server	Serves catalog data to user

Off the shelf components

The components to be developed at WSA make use of a number of off-the-shelf utilities. While these are not developed at WSA, they need to be deployed there and so should be identified and listed here. This is done in Table 6.

Table 6: Off the shelf components

Component Name	Stereotype	Deployment	Function
DBI	Module	Curation Server, Web Server	Database independent interface
DBD::Sybase	Module	Curation Server, Web Server	MS SQL database driver
CFITSIO.py	Module	Curation Server, Web Server	Python bindings to cfitsio
cfitsio	Library	Curation Server, Web Server	C library for FITS I/O handling
ast	Library	Web Server	C library for WCS handling
JSKY	Java classes	Web Server	coordinate conversion
Java.sql	Java classes	Web Server	database connectivity
MS JDBC	Java classes	Web Server	MS database drivers
PythonMagick.py	Module	Curation Server, Web Server	Python bindings to ImageMagick

WRAPPER MODULES AND THEIR INTERFACES

Query Handler

Name of Component:

QueryHandler.py

Description:

This module provides a set of interface functions that carry out the queries to the database that are required by the curation use cases. The purpose here is to gather all the queries that are required into one module and hide the database implementation details behind the interfaces. This module deals with read-only database queries, i.e. no writes, modifications, or additions to the database are carried out.

Interfaces:

findNonStacked(program ID, name of stack/mosaic image, list of image filenames)

This formulates and submits the database query to find newly acquired images that have not yet been added to the given stack or mosaic image. The results of the query are parsed and placed in the list of image filenames.

findNonDiffed(program ID, passband ID for 1st image, passband ID for 2nd image)

This formulates and submits the database query to find newly acquired images in the given passbands from which difference images have not yet been constructed.

findWfcamSources(program ID, lower bound RA, upper bound RA, lower bound declination,
upper bound declination, list of sources)

This formulates and submits the database query to find WFCAM sources for the given program that lie within the given bounded region on the sky.

Dependencies:

The implementation details of this module depend upon the database management system that is installed on the load and catalogue servers. Crucially, this module also depends upon the database connectivity mechanisms. We envisage that this functionality will be provided by the Python DBI and DBD::Sybase modules. Again, the implementation details are hidden behind the interfaces. The implementation of the curation use cases only sees the interface layer provided by this module.

Implementation notes

More interfaces will be added to serve the needs of curation use cases CU8, CU9, CU10, CU11, and CU12.
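As an illustration of how a query interface can hide the database details, the following is a minimal sketch of a findNonStacked-style function. It is not the WSA implementation: the standard-library `sqlite3` module stands in for the DBMS on the load server, the `image` and `stack_member` tables and their columns are invented for the example, and the result is returned as a list rather than placed in an output argument.

```python
import sqlite3


def find_non_stacked(conn, program_id, stack_name):
    """Return filenames of images in `program_id` not yet part of `stack_name`.

    The SQL and the schema are hypothetical; callers only see this function.
    """
    sql = (
        "SELECT i.filename FROM image AS i "
        "WHERE i.program_id = ? "
        "AND i.filename NOT IN "
        "(SELECT s.filename FROM stack_member AS s WHERE s.stack_name = ?) "
        "ORDER BY i.filename"
    )
    return [row[0] for row in conn.execute(sql, (program_id, stack_name))]


def demo_connection():
    """Build a throwaway in-memory database with a few illustrative rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE image (filename TEXT PRIMARY KEY, program_id TEXT);
        CREATE TABLE stack_member (stack_name TEXT, filename TEXT);
        INSERT INTO image VALUES ('a.fit', 'P1');
        INSERT INTO image VALUES ('b.fit', 'P1');
        INSERT INTO image VALUES ('c.fit', 'P2');
        INSERT INTO stack_member VALUES ('stack1.fit', 'a.fit');
        """
    )
    return conn
```

A curation script would call `find_non_stacked(conn, "P1", "stack1.fit")` and never see the SQL, so a change of schema or database driver stays inside the module.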

**Figure 4:** Component diagram for the query handler module.

Database Handler

Name of Component:

DbHandler.py

Description:

This module provides a set of methods for carrying out database update, addition, and modification operations that are required by the curation use cases. In these use cases, a number of very specific database operations are required. The purpose of this module is to provide one clean interface for each of these operations. All the database specific details will be hidden behind these interfaces.

Interfaces:

writeLock()

Write-locks the database system deployed on the load server to prevent other tasks from modifying the database.

writeUnlock()

Re-enables write access on the database system on the load server.

updateCurationLog(name of curation use case, program ID, timestamp)

Updates the archive curation log tables in the database for a new occurrence of the given curation use case operation.

loadTransactionData(name of transaction file)

Loads image metadata and/or source catalog data into the database from the named intermediate transaction file.

createNewSpatialIndicies

Calculates spatial indicies for all new records having spatial coordinates.

createPhotoCalibJoin

Creates join tables between WFCAM sources and photometric calibration sources.

updatePhotoCalib

Updates the appropriate database tables with new photometric calibration information.

createExternalJoin

Creates join tables between WFCAM measurements and external surveys.

reCreateTableIndicies

Creates or recreates table indices in the database.

verifyCuration

Carries out the database operation to verify that the latest required curation task has the most recent timestamp.

freezeSubset

Creates a world-readable subset of the program database based on observation date, current release, and proprietary period.

release

Invokes operation on catalog server to place new database products online.

attachTableConstraints

dropTableConstraints

**Figure 5:** Component diagram for the database handler module.

Dependencies

As with the QueryHandler module, database drivers will be required for this module.

Implementation notes

Many of the database operations involve extracting large amounts of data, performing some type of analysis task, and then re-inserting the results into the database. Performing these operations from the curation server over a remote database connection would generate a significant amount of network traffic, so it was decided not to adopt that practice. Instead, we plan to implement these operations entirely on the load server side and to make them available to the curation server as remote procedure calls. Each of the methods defined for the DbHandler module will carry out the appropriate remote procedure calls to services that are running on the load server. It is the load server that takes care of the implementation details for handling the database. The set of functions providing the remote services will be defined in the module DbRpcServer, which will run on the load server. Implementation of this module will be given in later issues of this document.
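The split between DbHandler on the curation server and DbRpcServer on the load server can be sketched with the XML-RPC modules of the Python standard library. This document does not specify the RPC protocol, so the choice of XML-RPC is an assumption made purely for illustration; the in-memory `CURATION_LOG` list is a stand-in for the real database update, and `start_rpc_server` is a hypothetical helper.

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer

# Stand-in for the curation log table held by the DBMS on the load server.
CURATION_LOG = []


def update_curation_log(use_case, program_id, timestamp):
    """Load-server side: record one occurrence of a curation use case."""
    CURATION_LOG.append((use_case, program_id, timestamp))
    return len(CURATION_LOG)


def start_rpc_server(host="127.0.0.1", port=0):
    """Start a DbRpcServer-like service in a background thread (port 0 = any free port)."""
    server = SimpleXMLRPCServer((host, port), logRequests=False, allow_none=True)
    server.register_function(update_curation_log, "updateCurationLog")
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


class DbHandler:
    """Curation-server side: each method forwards to a remote procedure."""

    def __init__(self, url):
        self._proxy = ServerProxy(url, allow_none=True)

    def updateCurationLog(self, use_case, program_id, timestamp):
        # Only the call and its small result cross the network.
        return self._proxy.updateCurationLog(use_case, program_id, timestamp)
```

In use, the server would be started once on the load server and a curation script would create `DbHandler("http://host:port")` with the real address. Only the call arguments and a short result travel over the network, which is the motivation given above for keeping the heavy database work on the load server.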

FITS Reader

Name of Component:

FitsReader.py

Description:

This carries out operations required in the curation use cases that involve reading FITS file attributes.

Interfaces:

open(FITS filename)

Opens the given filename for reading. This must be called before using any of the other methods below.

close()

Closes the currently defined filename. This must be called before a new filename is processed.

isImageContainer(HDU ID)

Looks at the FITS keys and attributes in the given HDU and determines whether or not it contains pixel image data. This returns boolean true (numeric 1) if this is the case, otherwise returns boolean false (numeric 0).

isCatalogContainer(HDU ID)

Looks at the FITS keys and attributes in the given HDU and determines whether or not it contains source catalog data. This returns boolean true if this is the case, otherwise returns boolean false.

getProgramID(programID)

Reads program identifier from FITS header and places this in the Python object that represents the program ID.

getImageArrays(list of HDU IDs)

Find all HDUs in the currently defined FITS filename that correspond to image data and place them in a Python array.

getCatalogArrays(list of HDU IDs)

Find all HDUs in the currently defined FITS filename that correspond to source catalogue data and place them in a Python array.

makeTransactionData(HDU ID, Destination transaction file)

Strips all FITS keys and attributes and dumps them into an intermediate transaction file for later ingestion into the database. This method will determine the type of contents of the specified HDU ID of the FITS file - that is, whether the array contains image or catalog data.

Dependencies:

This module will require appropriate FITS input utilities. The CFITSIO Python binding module CFITSIO.py will be used. This in turn requires the low level cfitsio library to be deployed.
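The tests behind isImageContainer and isCatalogContainer can be sketched from standard FITS header keywords alone. This is a simplified illustration, not the planned implementation: headers are represented as plain Python dictionaries rather than being read through CFITSIO.py, the function names are hypothetical, and only the standard `XTENSION` and `NAXIS` keywords are examined.

```python
def is_image_container(header):
    """True if the header describes an HDU holding pixel data.

    A primary HDU has no XTENSION keyword and is treated as image-type;
    it only counts as an image container if it actually has data axes.
    """
    is_image_hdu = header.get("XTENSION", "IMAGE").strip() == "IMAGE"
    return is_image_hdu and header.get("NAXIS", 0) > 0


def is_catalog_container(header):
    """True if the header describes an ASCII or binary table extension."""
    return header.get("XTENSION", "").strip() in ("TABLE", "BINTABLE")


def image_hdu_ids(headers):
    """Return the indices of all HDUs that hold image data."""
    return [i for i, header in enumerate(headers) if is_image_container(header)]
```

For a file whose primary HDU is empty and which has one image extension and one binary table extension, `image_hdu_ids` returns `[1]`, the role that getImageArrays plays in the interface list above.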

**Figure 6:** Component diagram for the FITS reader module.

Compress Handler

Name of Component:

CompressHandler.py

Description:

Wrapper to handle image compression operations necessary for generating the library compressed images.

Interfaces:

compress(FITS filename, array HDU ID, target filename)

Compresses pixel data in the given HDU of the input FITS array and writes compressed image to the target filename.

Dependencies:

It is intended to store JPEG images for each WFCAM science image. The PythonMagick module provides the functionality required.

**Figure 7:** Component diagram for the image compression handler module.

Catalogue Extraction Driver

Name of Component:

CatExDriver.py

Description:

This module provides a wrapper to the software for source extraction, analysis, measurement and classification based on astronomical images.

Interfaces:

extractSources(image container FITS file, destination catalog FITS file)

Extracts all sources from the input image. The extracted sources are then put through an analysis prescription for flux measurement, classification, etc. The results are placed in the destination FITS file.

measureSources(image container FITS file, list of sources, destination catalog FITS file)

Performs ``list driven'' photometry on the input image. The same source analysis prescription as above is carried out in this case on the input list of source positions on the image.

Dependencies:

This module requires the underlying source extraction software. This will be the set of tools developed at CASU.

**Figure 8:** Component diagram for the source catalogue extraction driver module.

Difference Imaging Driver

Name of Component:

DiaDriver.py

Description:

This module provides a wrapper for the difference imaging software.

Interfaces:

makeDiffImage(first image FITS file, second image FITS file, difference image FITS file)

Generates difference images, HDU by HDU, corresponding to the subtraction of the second image from the first image.

Dependencies:

This requires the difference imaging software package.

**Figure 9:** Component diagram for the difference imaging software driver module.

Stack and Mosaic Driver

Name of Component:

StackDriver.py

Description:

Driver module for the software package that builds image stacks and mosaics.

Interfaces:

addToMosaic(target mosaic image filename, list of new image files)

Adds the new images to the target mosaic image. If the target file does not exist, this module will take that to mean that a new mosaic image is to be created.

addToStack(target stacked image filename, list of new image files)

Adds the new images to the target stacked image. If the target file does not exist, this module will take that to mean that a new stacked image is to be created.

Dependencies:

The implementation details of this module depend on the stacking and mosaicing software package that is to be provided by CASU.

**Figure 10:** Component diagram for the image stacking/mosaicing driver module.

Photometry Calibration Driver

Name of Component:

PhotoCalibDriver.py

Description:

Driver module for the software package to calculate photometric calibration coefficients.

Interfaces:

solve(input source list, input standard source list, calibration coefficients)

**Figure 11:** Component diagram for photometric calibration software driver module.

Application Driver

Name of Component:

AppDriver.py

Description:

This module will function as a wrapper around the Python system call that will be used to execute applications from the shell. The applications to be run will be software executables. This module will analyse the output of the system call to determine the exit status of the program and/or whether or not the application crashed with a core dump.

Interfaces:

run(command application name)
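A minimal sketch of such a `run` interface is given below. It uses the standard-library `subprocess` module in place of the `system` call named above, since `subprocess` reports termination by a signal directly; the return convention (a pair of exit status and signal name) is an assumption of this example. Note that it detects only that a signal such as SIGSEGV ended the process, not whether a core file was actually written, and that negative return codes are POSIX behaviour.

```python
import shlex
import signal
import subprocess
import sys


def run(command):
    """Run `command` and return (exit_status, signal_name).

    exit_status is the program's exit code, or None if it was killed by a
    signal; signal_name is e.g. "SIGSEGV" in that case, otherwise None.
    """
    completed = subprocess.run(shlex.split(command), capture_output=True, text=True)
    if completed.returncode < 0:
        # On POSIX, a negative return code -N means "terminated by signal N".
        return None, signal.Signals(-completed.returncode).name
    return completed.returncode, None


def python_command(source):
    """Helper for demonstrations: build a command that runs a Python snippet."""
    return f"{shlex.quote(sys.executable)} -c {shlex.quote(source)}"
```

A driver module such as StackDriver.py could then call `run(...)` and treat any non-zero status or any signal name as a failure to be logged.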

**Figure 12:** Component diagram for the application driver module.

IMPLEMENTATION OF CURATION USE CASES

In this section, the implementations of the curation use cases are given in "pseudo-script". The purpose is to show the structure of the scripts and how they use the modules, without obscuring this with language specific details. Actual scripting can follow on directly from the pseudo-scripts.

A script which uses the interfaces offered by a particular module starts with Python-like constructs such as

import DbHandler

A particular task where implementation code is required is preceded by a dash, as in

- initiate network connection

Instances of where the script uses an interface provided by a module are denoted as in

DbHandler.updateCurationLog("CU2", programID, timestamp)

CU1: Obtain science data from CASU

import DbHandler
import FitsReader

- initiate network connection

- look for new source directories

foreach new source directory

    - test directory for read readiness
    - set destination directory
    - get list of filenames in directory
    
    foreach file in directory

        - check for duplicates and reruns
        - transfer file to destination directory
        - verify success of transfer
        - log transfer in transfer log
	
        FitsReader.open(filename)
        programID = FitsReader.getProgramID
        FitsReader.close
        - note programID

- close network connection

- copy transfer log to permanent storage area

foreach programID
    DbHandler.updateCurationLog("CU1", programID, timestamp)
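
The per-file steps in the loop above (duplicate check, transfer, verification, logging) could be sketched as follows. A local file copy stands in for the network transfer, and the SHA-256 comparison, the function names, and the tab-separated log format are all assumptions of this sketch rather than decisions taken in this design.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def transfer_file(source, dest_dir, log_path):
    """Copy one file, verify the copy, and append a line to the transfer log.

    Returns the destination path, or None if a file of the same name is
    already present (treated here as a duplicate and skipped).
    """
    source = Path(source)
    target = Path(dest_dir) / source.name
    if target.exists():
        return None  # duplicate: leave the existing copy untouched
    shutil.copy2(source, target)
    checksum = sha256_of(target)
    if checksum != sha256_of(source):
        target.unlink()  # do not keep a corrupt copy
        raise IOError(f"transfer of {source} failed verification")
    with open(log_path, "a") as log:
        log.write(f"{target}\t{checksum}\n")
    return target
```

Detecting reruns (a reprocessed file with new content under an existing name) would need additional information, such as a version keyword in the FITS header, and is not attempted here.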

**Figure 13:** Component diagram for curation use case CU1

CU2: Create library compressed image frame products

INPUT - transfer log from CU1

import DbHandler
import FitsReader
import CompressHandler

- open transfer log for reading

foreach filename entry in transfer log

    FitsReader.open(filename)

    FitsReader.getImageArrays(arraylist)
    
    programID = FitsReader.getProgramID
    - note programID

    foreach image array entry in arraylist
    
        - determine target filename
	
        CompressHandler.compress(filename, image array, target filename)
	
        - log newly compressed image

    FitsReader.close

foreach programID	
    DbHandler.updateCurationLog("CU2", programID, timestamp)

**Figure 14:** Component diagram for curation use case CU2

CU3: Ingest details of image products

INPUT - transfer log from CU1
      - compression log from CU2

import DbHandler
import FitsReader

DbHandler.writeLock

- open transfer log for reading

foreach filename entry in transfer log

    FitsReader.open(filename)

    # Need some way to deal with the compressed image filename

    FitsReader.getAllArrays(array list)
    
    foreach arrayID in array list
        - set name of transaction file    
        FitsReader.makeTransactionData(arrayID, transaction file name)
        - note transaction file name

    programID = FitsReader.getProgramID
    - note program ID

    FitsReader.close

foreach transaction filename
    DbHandler.loadTransactionData(filename)
    - delete transaction files

foreach programID
    DbHandler.updateCurationLog("CU3", programID, timestamp)

DbHandler.writeUnlock()
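
Most of the scripts in this section bracket their work between writeLock and writeUnlock. One way to guarantee that the lock is released even if a step fails is a context manager, sketched below. The write_locked helper and the StubDbHandler stand-in are hypothetical; only the names writeLock, writeUnlock, and updateCurationLog come from this document, and the choice to update the curation log only on success is an assumption.

```python
from contextlib import contextmanager
from datetime import datetime, timezone


class StubDbHandler:
    """Hypothetical stand-in that records calls instead of touching a DBMS."""

    def __init__(self):
        self.calls = []

    def writeLock(self):
        self.calls.append("writeLock")

    def writeUnlock(self):
        self.calls.append("writeUnlock")

    def updateCurationLog(self, use_case, program_id, timestamp):
        self.calls.append(("updateCurationLog", use_case, program_id))


@contextmanager
def write_locked(db, use_case, program_ids):
    """Hold the write lock for the body; log the use case only on success."""
    db.writeLock()
    try:
        yield
        # Reached only if the body raised no exception.
        timestamp = datetime.now(timezone.utc)
        for program_id in sorted(program_ids):
            db.updateCurationLog(use_case, program_id, timestamp)
    finally:
        db.writeUnlock()  # always release, even after an exception


db = StubDbHandler()
seen = set()
with write_locked(db, "CU3", seen):
    seen.add("LAS")  # program ID noted while looping over the transfer log
```

Because the set of program IDs is read only after the body has finished, the script can fill it in while it loops over the files.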

**Figure 15:** Component diagram for curation use case CU3

CU4: Ingest single frame source detections

INPUT - transfer log from CU1

import DbHandler
import FitsReader

DbHandler.writeLock

- open transfer log for reading

foreach filename entry in transfer log

    FitsReader.open(filename)
    next unless FitsReader.isCatalogContainer

    programID = FitsReader.getProgramID
    - note program ID
    
    FitsReader.getCatalogArrays(array list)
    
    foreach arrayID in array list
    
        - determine transaction file name
        FitsReader.makeTransactionData(arrayID, transaction file name)
	
    FitsReader.close

foreach transaction file
    DbHandler.loadTransactionData(filename)

foreach programID handled here
    DbHandler.updateCurationLog("CU4", programID, timestamp)

DbHandler.writeUnlock()

**Figure 16:** Component diagram for curation use case CU4

CU5: Create library H2-K difference image frame products

import DbHandler
import QueryHandler
import DiaDriver

DbHandler.writeLock

programID = "LAS"

QueryHandler.findNonDiffed(programID, "H2", "K", image pair list)

foreach pair in image pair list

    - make file name for difference image
    DiaDriver.makeDI(H2filename, Kfilename, diffimagefilename)

    - copy diffimagefilename into flat file system
    DbHandler.insertImage(diffimagefilename)

DbHandler.updateCurationLog("CU5", programID, timestamp)
DbHandler.writeUnlock()

**Figure 17:** Component diagram for curation use case CU5

CU6: Create spatial index attributes

import DbHandler

DbHandler.writeLock

DbHandler.createNewSpatialIndices

# need the program IDs from invoking above 
DbHandler.updateCurationLog("CU6", programID, timestamp)
DbHandler.writeUnlock()

**Figure 18:** Component diagram for curation use case CU6

CU7: Recalibrate photometry

INPUT - programID

import DbHandler
import PhotoCalibDriver

DbHandler.writeLock

DbHandler.createPhotoCalibJoin

PhotoCalibDriver.solve

# another sql script invoked on load server
DbHandler.updatePhotoCalib

DbHandler.updateCurationLog("CU7", programID, timestamp)
DbHandler.writeUnlock()

**Figure 19:** Component diagram for curation use case CU7

CU8: Create/update merged source catalogues

INPUT - program ID
      - prescription for merged source catalogues

import DbHandler
import QueryHandler

DbHandler.writeLock()

# Implementation details to go here

DbHandler.updateCurationLog("CU8", programID, timestamp)

DbHandler.writeUnlock()

The implementation details for this use case are still being developed at the time of the first issue of this document. The implementation will require specific queries to the database to get the required images. It is planned that the merging and association algorithms will be implemented as C/C++ code. The resulting executables will then need to be invoked from the script implementing this use case, probably via an interface provided in a wrapper module.

CU9: Produce list measurements between WFCAM passbands

import DbHandler

DbHandler.writeLock()

# Implementation details to go here

DbHandler.updateCurationLog("CU9", programID, timestamp)
DbHandler.writeUnlock()

The implementation details for this use case are still being developed at the time of the first issue of this document. As with CU8, the implementation of this use case will require queries to the database and will need to invoke C/C++ applications.

CU10: Compute/update proper motions

import DbHandler

DbHandler.writeLock()

# Implementation details to go here

DbHandler.updateCurationLog("CU10", programID, timestamp)
DbHandler.writeUnlock()

The components and interfaces required to implement this use case are not known at the time of the first issue of this document.

CU11: Recalibrate astrometry

import DbHandler

DbHandler.writeLock()

# Implementation details to go here

DbHandler.updateCurationLog("CU11", programID, timestamp)
DbHandler.writeUnlock()

This task is expected to involve a number of interactive procedures. As this is a Version 2, low priority use case, the required components and interfaces are not known at the time of the first issue of this document.

CU12: Get external catalogues

import DbHandler

DbHandler.writeLock()

# Implementation details go here

DbHandler.updateCurationLog("CU12", programID, timestamp)
DbHandler.writeUnlock()

The components and interfaces required to implement this use case are not known at the time of the first issue of this document.

CU13: Create library stacked/mosaiced image frame products

INPUT - programID
      
import DbHandler
import QueryHandler
import StackDriver
import FitsReader

DbHandler.writeLock

- get list of passbands required in this program

foreach passband 

    - get name of target image
    
    QueryHandler.findNonStacked(programID, targetName, list of filenames)
    
    if target is stacked image
       StackDriver.addToStack(targetName, list of filenames)
       
    else if target is mosaic image
       StackDriver.addToMosaic(targetName, list of filenames)
    
    FitsReader.open(target filename)
    FitsReader.getAllArrays(list of arrays)
    
    foreach arrayID in list of arrays
       - set name of transaction file
       FitsReader.makeTransactionData(arrayID, transaction filename)
       DbHandler.loadTransactionData(transaction filename)
       - delete transaction file
   
    FitsReader.close

    DbHandler.insertImage(target filename)
    
DbHandler.updateCurationLog("CU13", programID, timestamp)
DbHandler.writeUnlock()

**Figure 20:** Component diagram for curation use case CU13

CU14: Create standard source detection list for new stack/mosaic

INPUT - name of stack/mosaic image

import DbHandler
import FitsReader
import CatExDriver

DbHandler.writeLock

- set destination catalog container file name
CatExDriver.extractSources(stack/mosaic filename, catalog file name)

- set source transaction filename

FitsReader.open(stack/mosaic image filename)
FitsReader.getImageArrays(list of arrays)

foreach arrayID in list of arrays
    - set source transaction filename
    FitsReader.makeTransactionData(arrayID, transaction filename)
    DbHandler.loadTransactionData(transaction filename)
    - delete transaction file

FitsReader.close

DbHandler.updateCurationLog("CU14", programID, timestamp)
DbHandler.writeUnlock()

**Figure 21:** Component diagram for curation use case CU14

CU15: Run periodic curation tasks CU6 to CU9

import DbHandler

- run CU6
- run CU7
- run CU8
- run CU9

# what is the program ID here?
DbHandler.updateCurationLog("CU15", programID, timestamp)
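
A driver for this use case might simply call the four tasks in order and stop at the first failure, as sketched here. The run_periodic function, the convention that each task is a callable returning True on success, and the decision to stop rather than continue after a failure are all assumptions made for illustration.

```python
def run_periodic(tasks):
    """Run (name, callable) pairs in order, stopping at the first failure.

    Returns the names of the tasks that completed successfully.
    """
    completed = []
    for name, task in tasks:
        if not task():
            # Assumption: a failed task leaves the archive in a state
            # that later tasks should not build on.
            break
        completed.append(name)
    return completed
```

The returned list tells the operator which of CU6 to CU9 still need to be rerun.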

**Figure 22:** Component diagram for curation use case CU15

CU16: Create default joins with external catalogs

import DbHandler

DbHandler.writeLock

DbHandler.createExternalJoin

DbHandler.updateCurationLog("CU16", programID, timestamp)
DbHandler.writeUnlock()

**Figure 23:** Component diagram for curation use case CU16

CU17: Produce list driven measurements between WFCAM/non-WFCAM data

import DbHandler
import QueryHandler
import FitsReader
import CatExDriver

- specify program ID

DbHandler.writeLock

foreach image set

    - determine sky coverage of image (ra1, ra2, dec1, dec2)
    
    # Query sources to pass into CatExDriver
    QueryHandler.findWfcamSources(programID, ra1, ra2, dec1, dec2, source list)

    - convert source list positions to pixel coords on image

    - set name of FITS catalog file
    CatExDriver.measureSources(image filename, source list, catalog file name)

    FitsReader.open(catalog file name)
    FitsReader.getCatalogArrays(list of arrays)
    foreach arrayID in list of arrays
        - make transaction filename
        FitsReader.makeTransactionData(arrayID, transaction file)
        DbHandler.loadTransactionData(transaction file)
        - delete transaction file

    FitsReader.close

DbHandler.updateCurationLog("CU17", programID, timestamp)
DbHandler.writeUnlock()

**Figure 24:** Component diagram for curation use case CU17

CU18: Create/recreate table indices

import DbHandler

DbHandler.writeLock

DbHandler.reCreateTableIndicies

DbHandler.updateCurationLog("CU18", programID, timestamp)
DbHandler.writeUnlock()

**Figure 25:** Component diagram for curation use case CU18

CU19: Verify, freeze, and backup

import DbHandler

DbHandler.writeLock()
DbHandler.verifyCuration
DbHandler.freezeSubset

# done interactively
- backup DB
- copy DB to catalog server

DbHandler.updateCurationLog("CU19", programID, timestamp)
DbHandler.writeUnlock()

**Figure 26:** Component diagram for curation use case CU19

CU20: Release online new database product

import DbHandler

DbHandler.release
DbHandler.updateCurationLog("CU20", programID, timestamp)

**Figure 27:** Component diagram for curation use case CU20

THE CURATION DISCOVERY TOOL

One outcome of the Critical Design Review was the identification of the need for a use case discovery tool. This tool will carry out a number of tasks, including:

view the state of the system at a given time
list the tasks that need to be carried out at a given time
determine whether or not a given use case needs to be carried out
notify the archive scientist whenever a particular use case needs to be carried out

It is envisaged that this tool will comprise a series of executable scripts and background jobs. The implementation details are not known at the time of the first issue of this document. Details will be given in later issues.

USER INTERFACE COMPONENTS

Wrapper modules for pixel serving

To facilitate the pixel serving process, a number of wrapper modules will be deployed. The intention is that they provide clean interfaces to the CGI scripts that

User interface query handler module

Name of Component:

UiQueryHandler.py

Description:

This module handles the database queries that find the stored information needed to serve pixel data to the user.

Interfaces:

findImage(program ID, RA, DEC, filename)

Find the name of the file containing the image data, for the given program ID, that encloses the given position on the sky.

findImages(program ID, RA1, RA2, DEC1, DEC2, list of filenames)

For the given program ID, find all filenames containing pixel data that are contained within the specified boundary on the sky.
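
The two interfaces differ in the direction of the containment test: findImage looks for a frame whose footprint encloses a point, while findImages looks for frames whose footprints lie inside a search box. The sketch below illustrates this on an in-memory list rather than the database. The Frame record and its field names are hypothetical, reading "contained within" as "wholly inside" is an interpretation, and RA wrap-around at 0/360 degrees is ignored for simplicity.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    """Hypothetical record of one image file and its sky footprint (degrees)."""
    program_id: str
    filename: str
    ra_min: float
    ra_max: float
    dec_min: float
    dec_max: float


def find_image(frames, program_id, ra, dec):
    """Return the first filename whose footprint encloses (ra, dec), or None."""
    for f in frames:
        if (f.program_id == program_id
                and f.ra_min <= ra <= f.ra_max
                and f.dec_min <= dec <= f.dec_max):
            return f.filename
    return None


def find_images(frames, program_id, ra1, ra2, dec1, dec2):
    """Return filenames whose footprints lie wholly inside the search box."""
    return [f.filename for f in frames
            if f.program_id == program_id
            and ra1 <= f.ra_min and f.ra_max <= ra2
            and dec1 <= f.dec_min and f.dec_max <= dec2]
```

In the deployed module the same comparisons would be expressed as conditions in the SQL query rather than evaluated in Python.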

Dependencies:

The implementation of this module will require the appropriate database drivers.

Image handler module

Name of Component:

ImageHandler.py

Description:

Performs image pre-processing operations that are necessary before serving pixel data to the user. This module performs a role similar to that of the FitsReader wrapper module used for the curation tasks, but in this case image write operations are performed.

Interfaces:

extractSubImage(filename, HDU ID, RA, DEC, Nxpixels, Nypixels)

Extracts a sub-image from the pixel data contained in the given HDU ID of the given filename. The sub-image is centred on the given RA and DEC and has the given dimensions. The input FITS file must have WCS information encoded in the appropriate header.

updateWCS(filename, new WCS information)

Update the WCS information in the FITS header corresponding to the given filename.

attachSources(filename, source list)

Append the list of sources to the given FITS file as a new binary table extension.
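
For extractSubImage, once the RA and DEC have been converted to a pixel position through the WCS, the extraction reduces to computing array bounds and clipping them at the image edges. The following sketch covers only that step. The function name, the zero-based pixel convention, and the decision to clip rather than pad are assumptions, and the WCS conversion itself is left to the underlying library.

```python
def sub_image_bounds(x_centre, y_centre, nx, ny, image_width, image_height):
    """Return (x0, x1, y0, y1), the half-open pixel bounds of the cut-out.

    The box is nx by ny pixels centred on (x_centre, y_centre) and is
    clipped so that it never extends beyond the parent image.
    """
    x0 = int(round(x_centre)) - nx // 2
    y0 = int(round(y_centre)) - ny // 2
    x1, y1 = x0 + nx, y0 + ny
    # Clip to the parent image.
    x0, y0 = max(x0, 0), max(y0, 0)
    x1, y1 = min(x1, image_width), min(y1, image_height)
    if x0 >= x1 or y0 >= y1:
        raise ValueError("requested sub-image lies outside the parent image")
    return x0, x1, y0, y1
```

A request near the edge of a frame therefore returns a smaller sub-image than was asked for, which the calling CGI script would need to report to the user.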

Dependencies:

This module will function as a driver to underlying custom software for FITS image manipulation. This will be written in C++ and will require the off-the-shelf cfitsio and Starlink ast libraries.

Other Modules

Some modules deployed on the curation server will also be deployed on the web server. These are the compression handler module (Section 6.4) and the application driver module (Section 6.9).

Pixel serving components

As described in the User Interface Document, there will be four forms presented to the user for the extraction of pixel data from the archive.

PF1: Small image extraction
PF2: Batch small image extraction
PF3: Large image extraction
PF4: Stacked image extraction (Version 2)

Each of these forms will invoke a corresponding CGI script that will carry out the appropriate operations, including queries to the database, FITS image manipulations etc. In the component diagrams in Figs. 28-31, each of these four forms is depicted in the context of the underlying components that are involved in these operations.

Other components that will need to be developed are the scripts that generate pages enabling browsable access to the data and the CGI scripts that receive HTTP requests from remote users. The implementation details of these will be given in later issues of this document.

**Figure 28:** Component diagram for pixel form 1

**Figure 29:** Component diagram for pixel form 2

**Figure 30:** Component diagram for pixel form 3

**Figure 31:** Component diagram for pixel form 4

Catalogue serving components

As described in the User Interface Document, object catalogue data will be served through a Java servlet that runs on the web server. This servlet will use Java classes enabling database connectivity. Implementation details will be given in later issues of this document.

SUMMARY

Groups of components

As a guide to scheduling the development work, a breakdown of the groups of components that need to be developed is as follows:

Version 1 curation use cases where detailed designs on their implementations have been drawn up. These are CU1, CU2, CU3, CU4, CU5, CU6, CU7, CU13, CU14, CU15, CU16, CU18, CU19, and CU20.
Other Version 1 curation use cases CU8, CU9, and CU12. The required components include the SQL scripts, the code modules, the interfaces, and the implementation scripts themselves.
Version 2 curation use cases CU10, CU11, and CU12. The required components include the SQL scripts, the code modules, the interfaces, and the implementation scripts themselves.
The database wrapper modules, QueryHandler.py and DbHandler.py, serving the curation tasks. Components also include the underlying SQL scripts that are required to implement the database tasks carried out by each method interface.
Modules providing image handling facilities to the curation tasks: FitsReader.py and CompressHandler.py.
The software packages supplied by CASU along with the wrapper modules to go around them.
Difference imaging software and the associated wrapper module.
Photometric calibration software and the associated wrapper module.
The application driver module AppDriver.py.
The web forms and CGI scripts that serve pixel data to the end user.
The wrapper module, UiQueryHandler.py, to deal with queries to the database required by the pixel serving tools.
The image utility software and associated driver module required by the pixel serving tools.
The scripts to generate the pages allowing browsable access to the data.
The CGI script(s) allowing direct HTTP connections via the remote user's own client software.
Components enabling user access to object catalogue data.

Curation task matrix

Table 7: Task matrix showing the wrapper module interfaces and the tasks that use them (x = used by the curation use case, . = not used).

                            Curation Use Case
Interface                     1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20
QueryHandler
  findNonStacked              .  .  .  .  .  .  .  .  .  .  .  .  x  .  .  .  .  .  .  .
  findNonDiffed               .  .  .  .  x  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .
  findWfcamSources            .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  x  .  .  .
DbHandler
  writeLock                   .  .  x  x  x  x  x  x  x  x  x  x  x  x  .  x  x  x  x  x
  writeUnlock                 .  .  x  x  x  x  x  x  x  x  x  x  x  x  .  x  x  x  x  x
  updateCurationLog           x  x  x  x  x  x  x  x  x  x  x  x  x  x  x  x  x  x  x  x
  loadTransactionData         .  .  x  x  x  .  .  .  .  .  .  .  x  x  .  .  x  .  .  .
  createNewSpatialIndicies    .  .  .  .  .  x  .  .  .  .  .  .  .  .  .  .  .  .  .  .
  createPhotoCalibJoin        .  .  .  .  .  .  x  .  .  .  .  .  .  .  .  .  .  .  .  .
  createExternalJoin          .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  x  .  .  .  .
  reCreateTableIndicies       .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  x  .  .
  verifyCuration              .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  x  .
  freezeSubset                .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  x  .
  releaseProduct              .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  x
  attachTableConstraints      .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .
  dropTableConstraints        .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .
FitsReader
  open                        x  x  x  x  x  .  .  .  .  .  .  .  x  x  .  .  x  .  .  .
  close                       x  x  x  x  x  .  .  .  .  .  .  .  x  x  .  .  x  .  .  .
  isImageContainer            .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .
  isCatalogContainer          .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .
  getProgramID                x  x  x  x  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .
  getImageArrays              .  x  x  .  .  .  .  .  .  .  .  .  x  x  .  .  .  .  .  .
  getCatalogArrays            .  .  .  x  .  .  .  .  .  .  .  .  .  .  .  .  x  .  .  .
  makeTransactionData         .  .  x  x  x  .  .  .  .  .  .  .  x  x  .  .  x  .  .  .
CompressHandler
  compress                    .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .
CatExDriver
  extractSources              .  .  .  .  .  .  .  .  .  .  .  .  .  x  .  .  .  .  .  .
  measureSources              .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  x  .  .  .
DiaDriver
  makeDiffImage               .  .  .  .  x  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .
StackDriver
  addToMosaic                 .  .  .  .  .  .  .  .  .  .  .  .  x  .  .  .  .  .  .  .
  addToStack                  .  .  .  .  .  .  .  .  .  .  .  .  x  .  .  .  .  .  .  .
PhotoCalibDriver
  solve                       .  .  .  .  .  .  x  .  .  .  .  .  .  .  .  .  .  .  .  .
Curation use case 6           .  .  .  .  .  .  .  .  .  .  .  .  .  .  x  .  .  .  .  .
Curation use case 7           .  .  .  .  .  .  .  .  .  .  .  .  .  .  x  .  .  .  .  .
Curation use case 8           .  .  .  .  .  .  .  .  .  .  .  .  .  .  x  .  .  .  .  .
Curation use case 9           .  .  .  .  .  .  .  .  .  .  .  .  .  .  x  .  .  .  .  .

APPENDICES

ACRONYMS & ABBREVIATIONS

ADnn : Applicable Document No nn
CASU : Cambridge Astronomical Survey Unit
FITS : Flexible Image Transport System
HDU : Header-Data Unit
UML : Unified Modelling Language
VISTA : Visible and Infrared Survey Telescope for Astronomy
VPO : VISTA Project Office
WFAU : Wide Field Astronomy Unit (Edinburgh)

APPLICABLE DOCUMENTS

AD01	Data Flow	VDF-WFA-WSA-005 Issue: 1.0 02/04/03
AD02	Hardware/OS/DBMS	VDF-WFA-WSA-006 Issue: 1.0 02/04/03
AD03	Database Design	VDF-WFA-WSA-007 Issue: 1.0 02/04/03
AD04	User Interface	VDF-WFA-WSA-008 Issue: 1.0 02/04/03

CHANGE RECORD

Issue	Date	Section(s) Affected	Description of Change/Change Request Reference/Remarks
1.0	28/05/03	All	New document

NOTIFICATION LIST

The following people should be notified by email whenever a new version of this document has been issued:

WFAU:	P Williams, N Hambly
CASU:	M Irwin, J Lewis
QMUL:	J Emerson
ATC:	M. Stewart
JAC:	A. Adamson
UKIDSS:	S. Warren, A. Lawrence

__oOo__

About this document ...

This document was generated using the LaTeX2HTML translator Version 2K.1beta (1.47)

The command line arguments were:
latex2html -split 0 sadd

The translation was initiated by Ian Bond on 2003-07-22
