Package wsatools :: Package DbConnect :: Module DbSession
[hide private]

Module DbSession

source code

Database interface. The DbSession class provides a complete database API as well as connection management features. A database connection is instantiated upon construction of a DbSession object and then automatically closed when the object falls out of scope. Existing identical connections are reused, invisibly to the user, to reduce the number of simultaneous open connections. All access to the database should then be via the methods supplied by the DbSession class.

Usage

Database API

Import as:

   from wsatools.DbConnect.DbSession import DbSession

For a simple default database connection to the WSA:

   db = DbSession("WSA")
   db.update("Multiframe", "deprecated=0", where="multiframeID")
   print(db.queryNumRows("Multiframe"))
   print(db.queryAttrMax("multiframeID", "Multiframe"))

For joined queries, e.g. a list of all LAS stacks multiframeIDs, use the Join class:

   from wsatools.DbConnect.DbSession import Join

   mfIDs = db.query("Multiframe.multiframeID",
            table=Join(["Multiframe", "ProgrammeFrame"], ["multiframeID"]),
            where="frameType LIKE '%stack' AND programmeID=101")

For queries where a select statement is used within a where clause, use the SelectSQL class:

   from wsatools.DbConnect.DbSession import SelectSQL

   progMfs = SelectSQL("multiframeID", table="ProgrammeFrame",
                       where="programmeID=%s" % self.programmeID)
   fileNames = db.query("fileName", table="Multiframe",
                        where="multiframeID IN %s" % progMfs)

To perform a safe temporary disconnect from the database invoke the DbSession.goOffline() method. When any DbSession object that uses this connection next attempts to access the database, e.g. a query, the connection will be reactivated, and if the database is down, then it will persistently attempt to reconnect. Hence, you can also use this method at sensitive stages of a curation script to ensure the database connection remains alive.

Please refer to the DbSession class documentation for the full range of API methods and options.

Ingester

Import as:

   from wsatools.DbConnect.DbSession import Ingester

Before ingesting it is vital to ensure the schema of the ingest file matches the table into which it will be ingesting. This occurs when an Ingester object is created, so this should be done as early as possible in your code, and not within a For-Loop unnecesarily. To instantiate an Ingester object you must pass it a DbSession database connection to the database that hosts the table into which the ingest will occur, and a schema for the table(s) that will be ingested into during the lifetime of this particular Ingester object:

   db = DbSession()
   ingester = Ingester(db, tableSchema)

The tableSchema is just a list of Schema.Table objects for every table that will be ingested this session. Use the Schema.parseTables() method to create this schema, and then reduce or expand the list until just the exact set of tables to be ingest are listed. To ingest a native binary file located in the catalogue server's share path into the table:

   numRows = ingester.ingestTable("lasSource", "lasSource.nat", idxInfo)

Afterward the ingest file is deleted. The idxInfo is created by Schema.parseIndices() and is required if you wish to drop indices for ingest, to automatically recreate them afterward. If the database table doesn't already exist the ingester will create the table.

Please refer to the Ingester class documentation for further options.

Outgester

Import as:

   from wsatools.DbConnect.DbSession import Outgester, SelectSQL

To initialise an Outgester object you must supply it with a connection to the database where the outgest will occur:

   db = DbSession()
   sourceOutgester = Outgester(db)

Then to outgest the results of a particular table query to a binary file:

   filePathName = sourceOutgester.outgestQuery(SelectSQL(
     select="*", table="WSA.dbo.Multiframe", where="deprecated = 0"))

By default the file is outgested to the load server share path, and there is a delay following outgest to allow for the NFS to see the shared file. If an alternative path is given outside of the share path then there is no delay to wait for the NFS to see the file.

Please refer to the Outgester class documentation for further options.

Creating and modifying databases

The Database class provides methods to create and modify databases.

Import as:

   from   wsatools.DbConnect.DbSession import Database, DbSession
   from   wsatools.SystemConstants     import WsaConstants

To recreate the WSA with standard options on server "pharaoh":

   WSA = Database(WsaConstants.loadDatabase,
                  filegroups=[("Detection", "4 GB"),
                              ("Source", "4 GB"),
                              ("Curation", "4 GB"),
                              ("Indices", "4 GB")])
   WSA.create(DbSession("pharaoh."+WsaConstants.adminDatabase))

Please refer to the Database class documentation for further options.


Author: R.S. Collins

Organization: WFAU, IfA, University of Edinburgh

To Do:
Classes [hide private]
  DbSession
An interface to our database.
  Ingester
A special type of database session, where you wish to ingest data into a database table from a native binary or CSV file.
  Outgester
A special type of database session, where you wish to bulk outgest data from the database to a file.
  Database
Defines a database object, and provides method to create and modify databases through administrative database connections.
  Join
Defines a database inner join between two or more tables.
  LeftJoin
Defines a database left outer join between two tables.
  PkLeftJoin
Defines a database left outer join between two tables on the first table's primary key.
  SelectSQL
Defines an SQL select statement.
Functions [hide private]
int
bulkCopy(query, tableSchema, fromDb, toDb, fileTag, drive='', isSmallTable=True)
Copies table data via a bulk outgest and ingest to another table, which can be in a different database and on a different server.
source code
Variables [hide private]
  __package__ = 'wsatools.DbConnect'
Function Details [hide private]

bulkCopy(query, tableSchema, fromDb, toDb, fileTag, drive='', isSmallTable=True)

source code 

Copies table data via a bulk outgest and ingest to another table, which can be in a different database and on a different server. Destination table will be automatically created if it doesn't already exist.

Parameters:
  • query (SelectSQL) - Query of data selected to copy via outgest. Table names must be given with full path. Selection must be ordered by the ingest table's primary key in ascending order.
  • tableSchema (schema.Table) - Schema of the ingest table.
  • fromDb (DbSession) - Connection to the database where outgest will occur.
  • toDb (DbSession) - Connection to the database where ingest will occur.
  • fileTag (str) - Unique tag to attach to temporary outgest/ingest files.
  • drive (str) - Optionally outgest to a different catalogue server drive, instead of the share directory, e.g. "G:".
  • isSmallTable (bool) - If False, then stdout is redirected to file to work around a mystery bug in very large table outgests, which is inefficient for small tables.
Returns: int
Number of rows ingested.

Warning: Make sure the ingest table schema has been checked first.

To Do: Merge Outgester and Ingester classes into one BulkTransfer class with individual outgest, ingest and copy functions? Then we can handle the tableSchema initialisation a bit more neatly, and put fileTag into the initialiser. Then codes can choose whether or not to pre-initialise.