General utility functions. Mostly concerning manipulation of Python objects, and the file system.
Author: I.A. Bond
Organization: WFAU, IfA, University of Edinburgh
Contributors: R.S. Collins, N.J.G. Cross, N.C. Hambly, E. Sutorius
|
|||
ParsedFile Behaves like a file object, except that when iterating over file lines only non-blank, non-comment lines are returned and any EOL characters are removed together with trailing white-space. |
|||
WordWrapper Formats long strings so that they neatly fit within a certain width without words being split across lines. |
|||
Ratings Ratings is mostly like a dictionary, with extra features: the value corresponding to each key is the 'score' for that key, and all keys are ranked in terms of their scores. |
|
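The line filtering described for ParsedFile can be sketched as a plain generator. This is a minimal illustration only, not the class itself: the name iter_parsed_lines and the choice of "#" as the comment marker are assumptions, since the summary above does not say which character marks a comment.

```python
import io


def iter_parsed_lines(lines, comment_marker="#"):
    """Yield non-blank, non-comment lines with trailing white-space removed.

    Hypothetical helper; the comment marker is an assumed default.
    """
    for line in lines:
        stripped = line.rstrip()  # removes EOL characters and trailing spaces
        if not stripped:
            continue  # blank (or white-space only) line
        if stripped.lstrip().startswith(comment_marker):
            continue  # comment line
        yield stripped


sample = io.StringIO("# header\n\nalpha  \nbeta\n")
print(list(iter_parsed_lines(sample)))
```

Any iterable of strings works as input, so an open file object can be passed directly.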
|||
int |
|
||
list |
|
||
|
|||
str |
|
||
generator(str) |
|
||
list(list(dataType)) |
|
||
dict(str:list(int, int)) |
|
||
set |
|
||
list(int) |
|
||
str |
|
||
int |
|
||
int |
|
||
list(tuple(int, int)) |
|
||
str |
|
||
str |
|
||
dict |
|
||
mx.DateTime |
|
||
str |
|
||
str |
|
||
bool |
|
||
str |
|
||
|
|||
list(list) |
|
||
str |
|
||
list(X) |
|
||
float |
|
||
generator(list(X)) |
|
||
generator(X) |
|
||
|
|||
|
|
|||
__package__ =
|
|
Private helper function used by the WordWrapper class.
|
Arbitrarily sorts a list of tuples of form (keyword, value) by the order defined in the sequence of specified keywords. Example:
>>> arbSort([7, 3, 8, 0, 1], [8, 3], isFullKeySet=False)
[8, 3, 7, 0, 1]
To Do: See if defining my own compare function that calls index() is faster/simpler than the Decorate-Sort-Undecorate method used here. |
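One possible alternative to Decorate-Sort-Undecorate is a key function backed by a position lookup. The sketch below is an assumption about how that could look, not the module's implementation: arb_sort_sketch and its key parameter are hypothetical names, and the isFullKeySet option is not modelled (items missing from the key order are simply kept, in their original order, after the listed ones).

```python
def arb_sort_sketch(items, key_order, key=lambda item: item):
    """Sort items by the position of key(item) in key_order.

    Hypothetical helper. Items whose key is not listed sort last and,
    because sorted() is stable, keep their original relative order.
    """
    position = {k: i for i, k in enumerate(key_order)}
    fallback = len(key_order)  # rank shared by all unlisted keys
    return sorted(items, key=lambda item: position.get(key(item), fallback))


print(arb_sort_sketch([7, 3, 8, 0, 1], [8, 3]))
```

For (keyword, value) tuples, pass key=lambda pair: pair[0] so that the keyword alone decides the order.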
If the supplied directory does not exist then create it.
|
Given a human readable compact number range string, expand it to a complete sequence of numbers in a CSV string. Example:
>>> expandNumberRange(numberRange([1, 2, 3, 5, 6, 7]))
'1,2,3,5,6,7'
>>> expandNumberRange('1,2')
'1,2'
>>> expandNumberRange('0')
'0'
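The expansion can be sketched as follows. This is an illustrative reading of the documented examples, assuming non-negative integers and a "first-last" notation for runs; the name expand_number_range_sketch is hypothetical, and the 'None' string that numberRange returns for an empty sequence is not handled.

```python
def expand_number_range_sketch(number_range):
    """Expand a string such as '1-3, 5-7' into '1,2,3,5,6,7'.

    Hypothetical helper; assumes non-negative integers only.
    """
    numbers = []
    for part in number_range.split(","):
        part = part.strip()
        if "-" in part:
            first, last = part.split("-")
            numbers.extend(range(int(first), int(last) + 1))  # inclusive run
        else:
            numbers.append(int(part))
    return ",".join(str(n) for n in numbers)


print(expand_number_range_sketch("1-3, 5-7"))
```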
|
Gobble all entries in the given column of a space separated text file into a list. Lines that begin with the hash mark are treated as comments and are ignored. Example:
>>> column = extractColumn("/disk47/sys/test/Utilities/test.cat", 6)
>>> list(column)[:2]
['14.5723', '15.1406']
To Do: Replace with extractColumns()? Could leave this method here for speed and simplicity, though normally we want more than one column anyway! So maybe just extractColumns(file, 3)[0] isn't so bad. |
Extracts from the given file the data in the given list of columns as a list of strings for every column. Example:
>>> data = extractColumns("/disk47/sys/test/Utilities/test.cat",
...                       columnList=[6, 7], dataType=float)
>>> print(data[0][0], data[0][1], data[1][0], data[1][1])
14.5723 15.1406 0.0033 0.005
To Do: Possibly alter to make use of the CSV module's abilities to handle different dialects? Would simplify code a bit, and make more useful. |
Gets the available disk space for the supplied list of disks.
|
Returns the set of items for which the groupBy method returns the same item more than once. By default, this will simply return just the values in the list that are duplicated. The groupBy key option was removed, because it is better to pre-process the iterable once before passing it to this function and sorting (in this usage of groupby; in other usages it is useful). Examples:
>>> getDuplicates([1, 2, 1, 0])
set([1])
>>> getDuplicates(x[0] for x in [(1,2), (3,4), (1,4)])
set([1])
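The sort-then-group approach mentioned above can be sketched with itertools.groupby. This is only an illustration under the assumption that the items are sortable; get_duplicates_sketch is a hypothetical name, and Python 3 prints the result as {1} rather than set([1]).

```python
from itertools import groupby


def get_duplicates_sketch(iterable):
    """Return the set of values that occur more than once.

    Hypothetical helper; groupby only merges adjacent equal items,
    so the input is sorted first.
    """
    return {
        value
        for value, group in groupby(sorted(iterable))
        if sum(1 for _ in group) > 1
    }


print(get_duplicates_sketch([1, 2, 1, 0]))
```

Pre-processing stays with the caller, as the description recommends: pass a generator expression that extracts the field to compare.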
|
Returns a list of all indices where a value occurs in the given list. Example:
>>> getListIndices(["bob", "steve", "bob"], "bob")
[0, 2]
Note: Can probably avoid the need to use this function by employing a slightly different algorithm design, e.g. use a dictionary. |
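Both the direct lookup and the dictionary-based alternative suggested in the note can be sketched briefly. The names get_list_indices_sketch and build_index are hypothetical; the second function assumes the list values are hashable.

```python
from collections import defaultdict


def get_list_indices_sketch(values, target):
    """Return every index at which target occurs (one scan per call)."""
    return [i for i, v in enumerate(values) if v == target]


def build_index(values):
    """Map each value to the list of indices where it occurs (one scan in total)."""
    index = defaultdict(list)
    for i, v in enumerate(values):
        index[v].append(i)
    return index


names = ["bob", "steve", "bob"]
print(get_list_indices_sketch(names, "bob"))
print(build_index(names)["bob"])
```

Building the index once pays off when many different values are looked up in the same list.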
Gets the next available disk which is less than 99% full.
To Do: Instead of taking a SystemConstants object, why not make this a method of SystemConstants? |
Calculates the number of items expressed by a human-readable string of number ranges. Example:
>>> getNumberItems(numberRange(range(10)))
10
Note: This isn't a sensible way of doing things, as numberRange() is designed only for the purpose of printing human readable strings. It shouldn't be used as a data container for processing. |
|
Taking an ordered list of counts of a particular key, e.g. the results of an SQL "SELECT key, count(*) ... GROUP BY key ORDER BY key", this function returns a list of key ranges that contain up to the specified group size of counts. Example:
>>> groupByCounts([(10001, 5), (10002, 3), (10003, 12), (10004, 6)],
...               groupSize=10)
[(10001, 10002), (10003, 10003), (10004, 10004)]
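A greedy accumulation reproduces the documented example and may help to picture the grouping. It is a sketch under the assumption that a key whose own count exceeds the group size simply forms a range by itself; group_by_counts_sketch is a hypothetical name.

```python
def group_by_counts_sketch(key_counts, group_size):
    """Group ordered (key, count) pairs into (first_key, last_key) ranges.

    Hypothetical helper: a range is closed as soon as adding the next
    count would push its running total above group_size.
    """
    groups = []
    first = last = None
    total = 0
    for key, count in key_counts:
        if first is not None and total + count > group_size:
            groups.append((first, last))  # close the current range
            first, total = None, 0
        if first is None:
            first = key  # open a new range at this key
        last = key
        total += count
    if first is not None:
        groups.append((first, last))  # close the final range
    return groups


counts = [(10001, 5), (10002, 3), (10003, 12), (10004, 6)]
print(group_by_counts_sketch(counts, group_size=10))
```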
|
Like string.join, but operates on the contents of a dictionary instead of a list. Joins dictionary keyword and value pairs into the string: str(keyword) + joinStr + str(value) + sepStr etc. Example:
>>> joinDict(dict(a=1, b=2))
'a = 1, b = 2'
|
Like string.join, but can handle nested (or un-nested) sequences of string-castable objects. Example:
>>> joinNested([['a', 0], ['b', 1]])
'a, 0, b, 1'
|
Inverts the dictionary so that, where the input dictionary's values are lists, each item of a list becomes a key with the input dictionary's key as its value. If one item appears under several input keys, the inverted dictionary's values will be lists. Example:
>>> invertDict(dict(males=["bob", "chris"], females=["jane", "chris"]))
{'chris': ['males', 'females'], 'jane': ['females'], 'bob': ['males']}
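The inversion can be sketched with dict.setdefault. This illustration assumes every input value is a list and always produces list values, which matches the documented example; invert_dict_sketch is a hypothetical name.

```python
def invert_dict_sketch(mapping):
    """Turn {key: [item, ...]} into {item: [key, ...]}.

    Hypothetical helper; assumes every value in mapping is a list.
    """
    inverted = {}
    for key, items in mapping.items():
        for item in items:
            inverted.setdefault(item, []).append(key)
    return inverted


groups = dict(males=["bob", "chris"], females=["jane", "chris"])
print(invert_dict_sketch(groups))
```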
|
Returns an archive date/time data type, defaulting to the current time if no input argument is given. This defines the archive date/time data type, and is presently set to the mx.DateTime defined type. This function defines the time system for the archive (which is UTC).
|
Creates a timestamp using makeTimeStamp and formats it appropriately for use in ingest strings for Microsoft SQL Server (and handles a bug in the datetime object creation).
|
|
Generator test function is equivalent to memory hog
>>> moreThanOneIn(x for x in [])
False
>>> moreThanOneIn(x for x in ['g'])
False
>>> moreThanOneIn(x for x in [0.1 ,0.2])
True
>>> moreThanOneIn(x for x in [1, 2, 3])
True
|
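A generator can be tested for more than one item without materialising it in full. The sketch below is one way to do that with itertools.islice and is an assumption, not the module's code; more_than_one_in_sketch is a hypothetical name. Note that it consumes up to two items from the iterator it is given.

```python
from itertools import islice


def more_than_one_in_sketch(iterable):
    """Return True if iterable yields at least two items.

    Hypothetical helper; reads at most two items, so memory use stays
    constant however long the input is.
    """
    return len(list(islice(iterable, 2))) > 1


print(more_than_one_in_sketch(x for x in [1, 2, 3]))
```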
Performs multiple string substitutions on the given string. Example:
>>> multiSub("UKIRT and the WSA", [("WSA", "VSA"), ("UKIRT", "VISTA")])
'VISTA and the VSA'
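A simple way to picture multiple substitutions is to apply str.replace once per pair. This sketch assumes the pairs are applied sequentially, so a later pair can match text produced by an earlier one; whether the real function behaves that way is not stated here. The name multi_sub_sketch is hypothetical.

```python
def multi_sub_sketch(text, substitutions):
    """Apply each (old, new) replacement in turn.

    Hypothetical helper; replacements are sequential, not simultaneous.
    """
    for old, new in substitutions:
        text = text.replace(old, new)
    return text


print(multi_sub_sketch("UKIRT and the WSA", [("WSA", "VSA"), ("UKIRT", "VISTA")]))
```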
|
Disables keyboard interrupts whilst in this context.
|
Divides aList into nMax sublists by populating them with items taken alternately from the top and the bottom of the list. Example:
>>> npop([1, 2, 3, 4, 5])
[[1, 5], [2, 4], [3]]
>>> npop([1, 2, 3, 4, 5, 6], nMax=3)
[[1, 6, 2, 5], [3, 4]]
|
Given a sequence of integers it returns that sequence as a string representation of an ordered range of unique numbers. Example:
>>> numberRange([1, 2, 3, 5, 6, 7])
'1-3, 5-7'
>>> numberRange([1, 3, 5, 7])
'1, 3, 5, 7'
>>> numberRange([])
'None'
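The compaction can be sketched by walking the sorted unique values and closing a run whenever the next number is not consecutive. This is an illustration only: number_range_sketch is a hypothetical name, and rendering a run of exactly two numbers as "first-last" is an assumption that the documented examples do not settle.

```python
def number_range_sketch(numbers):
    """Return e.g. '1-3, 5-7' for [1, 2, 3, 5, 6, 7], or 'None' if empty.

    Hypothetical helper; runs of two numbers are written as 'a-b',
    which is an assumed convention.
    """
    unique = sorted(set(numbers))
    if not unique:
        return "None"
    runs = []
    start = previous = unique[0]
    for n in unique[1:]:
        if n != previous + 1:
            runs.append((start, previous))  # gap found, close the run
            start = n
        previous = n
    runs.append((start, previous))
    return ", ".join(str(a) if a == b else "%d-%d" % (a, b) for a, b in runs)


print(number_range_sketch([1, 2, 3, 5, 6, 7]))
```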
|
Returns a list of the given sequence in the original order, but with duplicates removed. Example:
>>> orderedSet([6, 4, 7, 4, 9, 1, 7])
[6, 4, 7, 9, 1]
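In Python 3.7 and later, where dictionaries preserve insertion order, the same result can be sketched in one line. This assumes the items are hashable; ordered_set_sketch is a hypothetical name and not the module's implementation.

```python
def ordered_set_sketch(sequence):
    """Return the items of sequence in first-seen order, without duplicates.

    Hypothetical helper; relies on dict insertion order (Python 3.7+).
    """
    return list(dict.fromkeys(sequence))


print(ordered_set_sketch([6, 4, 7, 4, 9, 1, 7]))
```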
|
Parses a string value and converts to float. NaNs and failures all return None. Example:
>>> parseFloat('3.1')
3.1
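The described behaviour, with NaNs and failures both mapped to None, can be sketched as below. Treating TypeError (for example a None input) as a failure is an assumption; parse_float_sketch is a hypothetical name.

```python
import math


def parse_float_sketch(value):
    """Convert value to float, returning None for NaN or unparsable input.

    Hypothetical helper; non-string inputs that float() rejects also
    give None, which is an assumed extension.
    """
    try:
        number = float(value)
    except (TypeError, ValueError):
        return None
    return None if math.isnan(number) else number


print(parse_float_sketch("3.1"))
```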
|
Splits a list into a list of equal-sized chunks. If the list is not equally divisible then the last chunk just contains the remaining number of elements. Example:
>>> list(splitList([1, 2, 3, 4]))
[[1, 2], [3, 4]]
>>> list(splitList([1, 2, 3, 4, 5, 6, 7], 3))
[[1, 2, 3], [4, 5, 6], [7]]
>>> list(splitList([1, 2, 3, 4, 5, 6, 7], 3, noSingles=True))
[[1, 2, 3], [4, 5, 6, 7]]
>>> list(splitList([1]))
[[1]]
>>> list(splitList([1], noSingles=True))
[]
>>> list(splitList([]))
[]
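The chunking, including the noSingles behaviour shown in the examples, can be sketched as follows. The default chunk size of 2 and the rule that a lone trailing element is merged into the previous chunk (or dropped when there is none) are inferred from the examples, not confirmed; split_list_sketch and its parameter names are hypothetical. Unlike a truly lazy generator, this sketch builds all chunks before yielding the first one.

```python
def split_list_sketch(items, chunk_size=2, no_singles=False):
    """Yield consecutive chunks of items, each of length chunk_size.

    Hypothetical helper; with no_singles=True a trailing one-item chunk
    is merged into the previous chunk, or dropped if it is the only one.
    """
    chunks = [items[i:i + chunk_size] for i in range(0, len(items), chunk_size)]
    if no_singles and chunks and len(chunks[-1]) == 1:
        single = chunks.pop()
        if chunks:
            chunks[-1] = chunks[-1] + single
    for chunk in chunks:
        yield chunk


print(list(split_list_sketch([1, 2, 3, 4, 5, 6, 7], 3, no_singles=True)))
```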
|
Given a list of lists, return a single sequence containing all of the elements of the combined list, as a generator. It also handles the more general case of a sequence of sequences, unlike sum(combinedList, []), which is the equivalent for the standard case. Example:
>>> ', '.join(unpackList([['a', 'b'], ['c', 'd']]))
'a, b, c, d'
>>> list(unpackList(splitList([1, 2, 3, 4])))
[1, 2, 3, 4]
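The standard library offers a lazy equivalent in itertools.chain.from_iterable, which accepts any iterable of iterables. The wrapper below is a sketch with a hypothetical name, not the module's implementation.

```python
from itertools import chain


def unpack_list_sketch(nested):
    """Lazily yield every element of every inner sequence, in order.

    Hypothetical helper built on itertools.chain.from_iterable.
    """
    return chain.from_iterable(nested)


print(", ".join(unpack_list_sketch([["a", "b"], ["c", "d"]])))
```

Because the outer argument may itself be a generator, the result of a chunking generator can be passed in directly.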
|
Generated by Epydoc 3.0.1 on Mon Sep 8 15:46:42 2014 | http://epydoc.sourceforge.net |