Package wsatools :: Module SourceMerger :: Class SourceMerger

Class SourceMerger

Merges detections over all passbands into a binary source table.

Nested Classes
    Nested Errors and Exceptions

Inherited from DbConnect.CuSession.CuSession: CuError

Instance Methods
 
_onRun(self)
    Initialises member variables required for source merging.

int
deprecateOldFrameSets(self)
    Deprecate frame sets that include one or more deprecated frames.

str, str, bool
mergeSources(self, newFrameSets, matchRadius)
    Merge source detections from different passbands to create the source table for ingest.

reseamSources(self)
    Reseam the merged source list for a given programme.

_addSeamingIndices(self)
    To optimise seaming, we need to apply an index to the Source table frameSetID, ra, dec columns (the CU7 index), which are outgested, as well as a covering index on the ppErrBits and seqNum attributes that are queried many times (the CU7_new index).

defaultdict(int:list(int))
_analyseSeams(self, centralSetID, adjacentFrames, fileNames, opticalAxes)
    Analyse a set of merged sources in an overlap region.

dict(int, float)
_createEpochDict(self)
    Returns: A dictionary of frame epochs referenced by multiframeID.

dict(int, float)
_createPixScaleDict(self)
    Returns: A dictionary of frame pixel scales referenced by multiframeID.

list(str)
_getCameoPassbandData(self, pb)
    Returns the cameo schema for this passband.

list(tuple)
_getExistingFrameSets(self)
    Returns: All frame sets in MergeLog for this programme.

list(tuple)
_getFrameSetSources(self, frameSetID)
    Get merged source data from a given frame set for the purposes of reseaming.

float
_getMedianImageExtent(self)
    Get the median image extent for all frame sets defined in a particular merge log.

int
_getNextID(self, attrName, tableName, where='', addProgID=False)
    Obtains next available ID number in the db for a given ID attribute.

_getPositionColumnData(self)
    Prepare info for C modules about the detection position column data from the table schema.

list(tuple(int, int))
_getProgFilters(self)
    Returns: List of filter IDs and number of passes for the programme.

_getSrcPositionColumnData(self)
    Define source position column data of a source table outgest for reseaming stage.

bool
_isQualityFlaggingComplete(self)
    Returns: True if quality bit flags have been fully updated for this programme's detection table.

int
_pickBestDuplicate(self, overlapSources, overlapAxes, overlapFrameIDs, overlapDetectNos)
    Chooses the best source from a set of duplicates in an overlap region.

int
_removeDeprecatedSources(self, tableName)
    Remove merged sources that are based on deprecated frame sets.

_setProgPassbandDetails(self)
    Initialisation routine for this programme's merge-log and passband data.

_setProgTableNames(self)
    Initialises table name member variables for current programme.

_updateAebvCSVFile(self)
    Put the latest aebv values from table Filter into aebv.csv for mkmerge.

Inherited from DbConnect.CuSession.CuSession: __del__, __init__, logBrokenFile, run

Inherited from object: __delattr__, __format__, __getattribute__, __hash__, __new__, __reduce__, __reduce_ex__, __repr__, __setattr__, __sizeof__, __str__, __subclasshook__

Static Methods

Inherited from DbConnect.CuSession.CuSession: logException

Class Variables
  commencementMessage = 'Source merging... Sit back and relax. P...
Message to log at the commencement of source merging.
  freeCacheInterval = 1000
Number of frame sets between calling FREEPROCCACHE during reseaming.
  minEpochSep = 183
Minimum epoch separation in days before a proper motion solution is worthwhile calculating.
  seamErrorBitMask = 4294967040
Error bit mask for removing innocuous information bits from error bit masks prior to seaming.
  seamingPairCriterion = 0.8
The match radius for associating the same source in overlap regions between frame sets in all mosaiced surveys.
  swVersion = 1.0
The merge software version.
  _autoCommit = True
Should this curation task auto-commit database transactions?
  _isPersistent = True
Should this curation task try to re-open broken database connections?
  _isVerbose = True
Turns off logging for certain sub-classes only.
  _useWorkDir = True
Create a temporary work directory.
  _useExtNum = True
Use extNum in detection table selection
  continueCompleteReseam = False
Should only be used to continue reseaming a source table from scratch following a crash.
  dateRange = DateRange(begin=<mx.DateTime.DateTime object for '...
Range of observation dates to merge.
  doCompleteReseam = False
Reset the merge-log to mark all frame sets as new, and reseam the entire source table.
  _extractor = None
Name of source catalogue extraction tool used to create this programme's detections.
  _filterPasses = None
A list of filterIDs and number of passes for this programme.
  _frameSetIDName = ''
Name of the column in the source table that identifies the unique ID of the frame set for a given source.
  _idxInfo = None
A dictionary of schema.Index object lists referenced by table name.
  _mergeLog = ''
Name of the merge log table for this programme.
  _mfidIndices = None
List of column indices in the merge log table that give the multiframeIDs to each passband's image.
  _newFrameSetIdx = 0
The column index in the merge log table that contains the newFrameSet attribute.
  _numFilterCol = 0
The number of columns in the merge log that contain passband-specific attribute values, for a given passband.
  _numHeaderCol = 0
The number of columns in the merge log table before the repeating set of columns for passband-specific attributes begin.
  _outgester = None
Outgester object for outgesting detections and sources for reseaming.
  _progSchema = None
Dictionary of programme's table schema referenced by table name.
  _passbands = None
List of short-names for passbands of this programme, e.g. k_2.
  _sourceIDName = ''
Name of the column in the source table for the unique source ID.
  _sourceTable = ''
Name of the table to ingest merged sources for this programme.

Inherited from DbConnect.CuSession.CuSession: archive, comment, cuEventID, cuNum, curator, eTypes, isDayStampedLog, onlyNonSurveys, onlySurveys, programme, programmeID, resultsFilePathName, shareFileID, sysc

Properties

Inherited from object: __class__

Method Details

_onRun(self)


Initialises member variables required for source merging.

Overrides: DbConnect.CuSession.CuSession._onRun

deprecateOldFrameSets(self)


Deprecate frame sets that include one or more deprecated frames.

Returns: int
Number of frame sets deprecated.

mergeSources(self, newFrameSets, matchRadius)


Merge source detections from different passbands to create the source table for ingest.

Method: For every detector frame set:
  1) Obtain the detection list in each passband via bulk outgest of binary numerical data into the curation-client/load-server share file system;
  2) Create the sets of pair pointer files, forwards and backwards, for each distinct pair of passbands;
  3) Create the set of merged pointers based on "hand-shake" pairing between any pair of pointer sets (i.e. ensuring consistency of pair pointers between the passbands);
  4) Create the merged source list according to the schema prescription, appending new sources to an accumulating binary file for ingest back into the database.

Low-level C/C++ application codes are extensively used to do the heavy CPU/IO work.

Parameters:
  • newFrameSets (list(self.FrameSet)) - List of frame sets to be merged.
  • matchRadius (float) - Pair criterion in arcseconds.
Returns: str, str, bool
Pair of file names for the binary source table and the CSV formatted merge-log table for ingest. Also returns a status to say whether the source merging was complete or not.
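
The "hand-shake" pairing in step 3 can be illustrated with a minimal Python sketch. This is not the C/C++ implementation used by the module: the function name, the list-based pointer representation and the use of -1 for "no match" are all assumptions made for illustration.

```python
def handshake_pairs(forward, backward):
    """Return the (i, j) pairs that are consistent in both directions.

    forward[i] is the index in passband B matched to source i of passband A,
    backward[j] is the index in passband A matched to source j of passband B.
    A value of -1 means "no match" (a convention assumed for this sketch).
    """
    pairs = []
    for i, j in enumerate(forward):
        # Keep the pair only if B's pointer leads straight back to A's source.
        if 0 <= j < len(backward) and backward[j] == i:
            pairs.append((i, j))
    return pairs


# Hypothetical pointer sets for two passbands.
forward = [0, 2, 1, -1]   # A -> B
backward = [0, 3, 1]      # B -> A
pairs = handshake_pairs(forward, backward)
```

Here source 2 of passband A points at source 1 of passband B, but that source points back at a different A source, so the pair is rejected as inconsistent.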

reseamSources(self)


Reseam the merged source list for a given programme. Where the same sources have been observed over multiple frame sets, this routine identifies the best source candidate in overlap regions that produce non-unique source pairs. The priOrSec flag is set to the framesetID of the frame set that contains the best source, so primary sources are those with (priOrSec = 0 OR priOrSec = framesetID).

Method:
  1) Identify every current frame set for the programme that is new or has a new adjacent frame;
  2) For old current frame sets that have a new adjacent frame, reset the seam flags to default (i.e. not seamed) for safety;
  3) For every central frame / adjacent frames group, create lists of sources with the attribute subselection required to determine the seamless list;
  4) Produce pair pointers between the central frame sources and those in adjacent frames;
  5) For every central frame record, choose the best source using a seaming algorithm that considers distance from the optical axis, pixel processing quality, etc.;
  6) Set all central frame set sources to seam = 0 (i.e. inclusive);
  7) Finally, reset all central frame set sources that have duplicates in adjacent frame sets so that their seam flag points to either the central frame or the adjacent frame, whichever contains the source considered the best.
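
As a small illustration of the priOrSec convention described above, the following Python sketch selects primary sources from a list of rows. The row layout (source ID, frame set ID, priOrSec) and the function name are hypothetical; in practice this selection would be expressed as a condition in a database query.

```python
def is_primary(pri_or_sec, frame_set_id):
    """A source is primary if it has no duplicates (flag 0) or if the flag
    points back at its own frame set."""
    return pri_or_sec == 0 or pri_or_sec == frame_set_id


# Hypothetical rows: (sourceID, frameSetID, priOrSec).
sources = [
    (1001, 7, 0),  # no duplicates in any adjacent frame set
    (1002, 7, 7),  # has duplicates, but this frame set holds the best one
    (1003, 7, 8),  # a better duplicate lives in frame set 8
]
primaries = [sid for sid, fs_id, flag in sources if is_primary(flag, fs_id)]
```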

_analyseSeams(self, centralSetID, adjacentFrames, fileNames, opticalAxes)


Analyse a set of merged sources in an overlap region, i.e. sources that are paired between a central frame set and its adjacent frame sets. The required seaming data are first extracted from the source table; then the function examines each central frame set merged source in turn, checking the adjacent frame set pairs and noting any identical record that should be used in preference to the central record. The function returns a data object containing lists of sources for each frame set ID (central and adjacents) that need their seam index attribute set to indicate the presence of a secondary identical source or a preferred primary source.

Parameters:
  • centralSetID (int) - The UID of the central frame set.
  • adjacentFrames (list(tuple)) - The merge log rows of all adjacent frames.
  • fileNames (list(str)) - Filenames of outgest source and pointer files.
  • opticalAxes (list((float, float))) - Celestial coordinate pairs of the camera optical axis.
Returns: defaultdict(int:list(int))
A dictionary of duplicate sources: each key is a frame set ID, and its value is the list of source IDs whose seaming flag must be set to that frame set ID.

To Do: Could speed things up by looking at function calls within loops; consider passing entire lists to functions instead.
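
The shape of the returned defaultdict(int:list(int)) can be sketched as follows. This is only an illustration of the data structure, under the assumption that each key is the frame set ID to write into the seaming flag; the function name and the input tuples are hypothetical.

```python
from collections import defaultdict


def collect_duplicates(decisions):
    """Group source IDs by the frame set ID their seaming flag should take.

    decisions is an iterable of (source_id, best_frame_set_id) tuples, a
    hypothetical intermediate result of the per-source analysis.
    """
    duplicates = defaultdict(list)
    for source_id, best_frame_set_id in decisions:
        duplicates[best_frame_set_id].append(source_id)
    return duplicates


decisions = [(501, 12), (502, 13), (503, 12)]
duplicates = collect_duplicates(decisions)
```

Grouping by flag value in this way would allow one update per frame set ID rather than one per source.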

_createEpochDict(self)

Returns: dict(int, float)
A dictionary of frame epochs referenced by multiframeID.

_createPixScaleDict(self)

Returns: dict(int, float)
A dictionary of frame pixel scales referenced by multiframeID.

_getCameoPassbandData(self, pb)


Returns the cameo schema for this passband. The schema describes the contents of the individual detection records, i.e. the positions of quantities to be merged and their units. NB: if the base schema for the detection tables is changed, then this function may need to be updated to reflect those changes.

Parameters:
  • pb (str) - Short name of passband.
Returns: list(str)
Cameo schema attributes for this passband: Name, first byte on the record, SQL data type, and units.

_getExistingFrameSets(self)

Returns: list(tuple)
All frame sets in MergeLog for this programme.

_getFrameSetSources(self, frameSetID)


Get merged source data from a given frame set for the purposes of reseaming. The function makes a selection of attributes from those available for each merged source; those attributes are for use by the seaming algorithm to decide which is the best record amongst a set of duplicates. This function also returns two scalar values which encode the offset and number of attributes per passband as dictated by the selection.

Parameters:
  • frameSetID (int) - The UID of the source frame set.
Returns: list(tuple)
Attributes for the set of sources in the specified frame set.

_getMedianImageExtent(self)


Get the median image extent for all frame sets defined in a particular merge log. The idea is to use this as a measurement to define the adjacent frame set tolerance when seaming overlap regions, rather than using the Programme attribute (or indeed a hardwired value), so as to generalise the seaming procedure.

Returns: float
The median image extent, in degrees.
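
A minimal sketch of why a median is a reasonable choice for this tolerance, assuming the per-frame-set extents are already available as a list of values in degrees (the function name and the sample values are hypothetical):

```python
import statistics


def median_image_extent(extents_deg):
    """Return the median of the given image extents, in degrees."""
    if not extents_deg:
        raise ValueError("no frame sets in the merge log")
    return statistics.median(extents_deg)


# Hypothetical extents; the single outlier barely shifts the median,
# whereas it would noticeably inflate a mean.
extents = [0.22, 0.23, 0.21, 0.90]
tolerance = median_image_extent(extents)
```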

_getNextID(self, attrName, tableName, where='', addProgID=False)


Obtains next available ID number in the db for a given ID attribute.

Parameters:
  • attrName (str) - Name of ID attribute.
  • tableName (str) - Name of the table where attribute resides.
  • where (str) - Optional SQL WHERE clause.
  • addProgID (bool) - If True, prefix initial ID with the programme UID to make ID unique across the entire archive.
Returns: int
Next available ID number.
Overrides: DbConnect.CuSession.CuSession._getNextID

_getProgFilters(self)

Returns: list(tuple(int, int))
List of filter IDs and number of passes for the programme.

_getSrcPositionColumnData(self)


Define source position column data of a source table outgest for the reseaming stage. Currently, RA and Dec are all that are required for the overlap pairing stage; the analysis is done later in Python by querying the DB for all the merged source data it requires.

_isQualityFlaggingComplete(self)

Returns: bool
True if quality bit flags have been fully updated for this programme's detection table.

_pickBestDuplicate(self, overlapSources, overlapAxes, overlapFrameIDs, overlapDetectNos)


Chooses the best source from a set of duplicates in an overlap region. Given a list of duplicate sources with positions and qualities, and a list of their respective image optical axis points, choose the best image out of the set of duplicates. Originally, until the precise meaning of the bit-wise quality flags was clarified, the seaming algorithm used proximity to the optical axis as the only criterion for judging the best source, choosing only from those sources having the maximum number of constituent detections. For DR2 et seq., the seaming algorithm has been modified to:
  a) mask off innocuous "information"-only error bits amongst the available passbands;
  b) eliminate any duplicate that has "warning" quality bits set, but bring all back into contention if this would result in no primary image;
  c) check the filter complement, choosing the merged source with the most filter coverage;
  d) then, and only then, use proximity to the frame set centre if there is still more than one potential primary; the tangent point for this test is, therefore, the centre of the frame set and not the optical axis.

Parameters:
  • overlapSources (list(tuple)) - Merged source data for the duplicates.
  • overlapAxes (list((float, float))) - Tangent point for each source.
  • overlapFrameIDs (list(int)) - UIDs for the frame sets of the sources.
  • overlapDetectNos (list(int)) - Number of detections in each frame.
Returns: int
The UID of the frame set which contains the best source.
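
The DR2 selection order (a) to (d) described above can be sketched in Python as follows. This is an illustration only, not the module's code: the Candidate tuple, its fields, and the use of None for a missing passband are assumptions. The mask value is the documented seamErrorBitMask, 4294967040, which equals 0xFFFFFF00 and therefore clears the lowest eight bits.

```python
from collections import namedtuple

SEAM_ERROR_BIT_MASK = 4294967040  # == 0xFFFFFF00

# Hypothetical record: err_bits holds one error-bit value per passband
# (None where the passband is missing); distance is the angular distance
# from the frame set centre in any consistent unit.
Candidate = namedtuple("Candidate", "frame_set_id err_bits distance")


def pick_best_duplicate(candidates):
    """Return the frame set ID of the preferred duplicate."""
    def has_warning(c):
        # (a) mask off information-only bits before testing for warnings
        return any(b is not None and (b & SEAM_ERROR_BIT_MASK)
                   for b in c.err_bits)

    def coverage(c):
        return sum(b is not None for b in c.err_bits)

    # (b) drop flagged duplicates, unless that would leave nothing
    pool = [c for c in candidates if not has_warning(c)] or list(candidates)
    # (c) keep only those with the most filter coverage
    best_cov = max(coverage(c) for c in pool)
    pool = [c for c in pool if coverage(c) == best_cov]
    # (d) break any remaining tie by proximity to the frame set centre
    return min(pool, key=lambda c: c.distance).frame_set_id


candidates = [
    Candidate(10, (0, 256, 0), 0.01),    # warning bit set: eliminated
    Candidate(11, (16, 0, None), 0.02),  # information bit only, two filters
    Candidate(12, (0, 0, 0), 0.05),      # clean, three filters
]
best = pick_best_duplicate(candidates)
```

In this example frame set 12 wins on filter coverage even though it is the furthest from its frame set centre, because proximity is only used as the final tie-breaker.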

_removeDeprecatedSources(self, tableName)


Remove merged sources that are based on deprecated frame sets.

Parameters:
  • tableName (str) - Source table to edit.
Returns: int
Number of sources deleted.

_setProgPassbandDetails(self)


Initialisation routine for this programme's merge-log and passband data.

To Do: This will be a *lot* simpler when I've implemented the FrameSetView class.

_updateAebvCSVFile(self)


Put the latest aebv values from table Filter into aebv.csv for mkmerge.

To Do: Make this method a fundamental part of a new WfcamsrcInterface class, to replace the WfcamsrcInterface module?


Class Variable Details

commencementMessage

Message to log at the commencement of source merging.

Value:
'Source merging... Sit back and relax. Progress estimates are for this\
 stage only (ingest and seaming are still to come).'

seamingPairCriterion

The match radius for associating the same source in overlap regions between frame sets in all mosaiced surveys. This parameter is passed directly to the C pairing applications, and is an angular radius expressed in arcseconds. Originally (e.g. EDR) this was set to 1.0 arcsec; discussions with SJW etc. suggest this was somewhat overly generous, so it has been reduced to sqrt(2.0) times the typical worst centroiding error, which is equal to about 2 pix, or 0.8 arcsec. At the same time, duplicate detection in the seaming algorithm has been relaxed to not insist on the same passband set for duplicates.

Value:
0.8
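
To make the criterion concrete, the following Python sketch tests whether two positions fall within the 0.8 arcsec match radius. The actual pairing is done by the C applications; the function names and the choice of the haversine formula here are assumptions for illustration only.

```python
import math

SEAMING_PAIR_CRITERION = 0.8  # arcsec, the documented class variable value


def separation_arcsec(ra1, dec1, ra2, dec2):
    """Angular separation of two positions given in degrees (haversine)."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (math.sin((dec2 - dec1) / 2.0) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2.0) ** 2)
    return math.degrees(2.0 * math.asin(math.sqrt(h))) * 3600.0


def is_same_source(ra1, dec1, ra2, dec2, radius=SEAMING_PAIR_CRITERION):
    """True if the two positions lie within the match radius (arcsec)."""
    return separation_arcsec(ra1, dec1, ra2, dec2) <= radius
```

For example, two positions offset by 0.5 arcsec in declination would be paired, whereas an offset of 1.0 arcsec (the original EDR radius) would not.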

swVersion

The merge software version. Keeps track of any changes after deployment; if changes to this module (or any module used by this one) require a complete remerging of all merged source tables in the archive, then increment this merge software version number. NB: use with caution, since the resulting catalogue processing will take a long time if the archive is large.

Value:
1.0

continueCompleteReseam

Should only be used to continue reseaming a source table from scratch following a crash. This marks frame sets as old whilst reseaming, without the safety catch of resetting all frame sets to new initially.

Value:
False

dateRange

Range of observation dates to merge. Use begin dates with caution, as for real runs you always need this to be the early default value.

Value:
DateRange(begin=<mx.DateTime.DateTime object for '1753-01-01 00:00:00.\
00' at 2ba0db0>, end=<mx.DateTime.DateTime object for '9999-12-31 00:0\
0:00.00' at 28b8cd8>)

doCompleteReseam

Reset the merge-log to mark all frame sets as new, and reseam the entire source table. This allows us to mark frame sets as old whilst we are reseaming, which would be unsafe if we were reseaming partial data.

Value:
False

_passbands

List of short-names for passbands of this programme, e.g. k_2.

Value:
None