Client

class deker.client.Client(uri='', *, executor=None, workers=None, write_lock_timeout=60, write_lock_check_interval=1, loglevel='ERROR', memory_limit=0, skip_collection_create_memory_check=False, **kwargs)

Bases: SelfLoggerMixin

The Deker Client is the first object a user works with.

It creates and retrieves Collections and provides a connection (path) to the Deker collections’ storage by URI. A local collection URI must contain the file:// scheme and the path to the collections storage on the local machine. The connection to the storage is provided by a client-based context, which remains open while the Client is open, and vice versa: while the context is open, the Client is open too.

Client has a context manager that opens and closes the context itself:

from deker.client import Client

with Client("file://...") as client:
    ...  # some important job here

You may also use the Client directly and close it yourself:

client = Client("file://...")
...  # some important job here
client.close()

Because Client has a context manager, its instance is reusable:

client = Client("file://...")
...  # some important job here
client.close()
with client:
    ...  # some important job here
with client:
    ...  # some important job here
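The open/close behaviour described above can be illustrated without a real storage. The following is only a minimal sketch: ReusableClient is a hypothetical stand-in written for this illustration, not Deker's implementation. It assumes that the client is open right after construction and that entering the context reopens a closed client.

```python
class ReusableClient:
    """Hypothetical stand-in for a client whose context can be reopened."""

    def __init__(self, uri: str) -> None:
        self.uri = uri
        self._open = True  # assumption: the client is open right after construction

    @property
    def is_open(self) -> bool:
        return self._open

    @property
    def is_closed(self) -> bool:
        return not self._open

    def close(self) -> None:
        self._open = False

    def __enter__(self) -> "ReusableClient":
        self._open = True  # assumption: re-entering reopens a closed client
        return self

    def __exit__(self, exc_type, exc_val, exc_tb) -> None:
        self.close()  # returns None, so exceptions are not suppressed


client = ReusableClient("file://...")
client.close()
states = [client.is_closed]          # closed after an explicit close()
with client:
    states.append(client.is_open)    # open again inside the context
states.append(client.is_closed)      # closed again on exit
```

The two status flags are always opposite to each other, which is why the sketch derives is_closed from is_open instead of storing both.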

Properties

  • is_closed

  • is_open

  • meta_version

  • root_path

API methods

  • create_collection: creates a new Collection on the storage and returns its object instance to work with. Accepts:

    • collection unique name

    • an instance of ArraySchema or VArraySchema

    • chunking and compression options (optional); default is None

    • type of a storage adapter (optional); default is HDF5StorageAdapter

  • get_collection: returns a Collection object by the given name if it exists, otherwise None

  • check_integrity: checks the integrity of the embedded storage database at different levels;

    Either performs all checks and prints the errors found, or exits on the first error. The final report may be saved to a file.

  • calculate_storage_size: calculates the size of the whole storage or of a given Collection

  • close: closes Client and its context

  • clear_locks: clears all current locks within the storage or within a given Collection

  • __enter__: opens Client context manager

  • __exit__: automatically closes Client context manager on its exit

  • __iter__: iterates over all collections under the provided URI path and yields Collection instances

__enter__()
Return type

Client

__exit__(exc_type, exc_val, exc_tb)
Parameters
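  • exc_type (Any) – type of the exception raised inside the with block, or None if no exception occurred

  • exc_val (Any) – the exception instance, or None

  • exc_tb (Any) – the traceback of the exception, or None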

Return type

None

__init__(uri='', *, executor=None, workers=None, write_lock_timeout=60, write_lock_check_interval=1, loglevel='ERROR', memory_limit=0, skip_collection_create_memory_check=False, **kwargs)

Deker client constructor.

Parameters
  • uri (str) – uri to Deker storage

  • executor (Optional[ThreadPoolExecutor]) – external ThreadPoolExecutor instance (optional)

  • workers (Optional[int]) – number of threads for Deker

  • write_lock_timeout (int) – Number of seconds a parallel writing process waits for the locked file to be released

  • write_lock_check_interval (int) – Number of seconds a parallel writing process sleeps between lock checks

  • loglevel (str) – Level of Deker loggers

  • memory_limit (Union[int, str]) –

    Limit of memory allocation per array/subset, in bytes or in a human-readable form of kilobytes, megabytes or gigabytes, e.g. "100K", "512M", "4G". Human-readable values are converted into bytes. If the result is <= 0, total RAM + total swap is used

    Note

    This parameter is used to abort early at runtime when a potential memory overflow is detected

  • skip_collection_create_memory_check (bool) – If True, the size check during collection creation is skipped

  • kwargs (Any) – a wildcard, reserved for any extra parameters

Return type

None
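The memory_limit conversion described above can be sketched as follows. This is an illustration under stated assumptions, not Deker's own parser: the helper names parse_memory_limit and effective_limit are hypothetical, binary multiples (1K = 1024 bytes) are assumed because the base is not stated here, and the total of RAM and swap is passed in by the caller rather than measured.

```python
import re
from typing import Union

# Assumption: binary multiples; the documentation above does not state the base.
_UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3}


def parse_memory_limit(limit: Union[int, str]) -> int:
    """Convert an int or a string such as "512M" into a number of bytes."""
    if isinstance(limit, int):
        return limit
    match = re.fullmatch(r"\s*(\d+)\s*([KMG])?\s*", limit.upper())
    if match is None:
        raise ValueError(f"unsupported memory limit: {limit!r}")
    number, unit = match.groups()
    # A missing suffix means the value is already in bytes.
    return int(number) * _UNITS.get(unit, 1)


def effective_limit(limit: Union[int, str], total_ram_and_swap: int) -> int:
    """Fall back to the total of RAM and swap when the parsed limit is <= 0."""
    parsed = parse_memory_limit(limit)
    return parsed if parsed > 0 else total_ram_and_swap
```

With these assumptions, "100K" becomes 102400 bytes, and the default memory_limit=0 resolves to whatever total the caller supplies.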

__iter__()

Iterate over all collections in the storage.

Return type

Generator[Collection, None, None]

calculate_storage_size(collection_name='')

Get the size of the storage or of a certain collection in bytes or converted to human representation.

Warning

Size calculation may take a long time. Maybe you’d like to have some coffee while it’s working.

Parameters

collection_name (str) – Name of a Collection. If not passed, the whole storage will be counted.

Return type

StorageSize
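To show why the calculation can be slow, here is a rough sketch of one way to sum up a storage directory: every file under the root has to be visited. It is not Deker's implementation. SizeReport, human_size and directory_size are hypothetical names, the real StorageSize type may be shaped differently, and the 1024-based units are an assumption.

```python
import os
import tempfile
from pathlib import Path
from typing import NamedTuple


class SizeReport(NamedTuple):
    """Hypothetical result type; Deker's StorageSize may be shaped differently."""

    size_bytes: int
    human: str


def human_size(num_bytes: int) -> str:
    """Format a byte count using 1024-based units (an assumption)."""
    units = ("B", "KB", "MB", "GB", "TB")
    size = float(num_bytes)
    index = 0
    while size >= 1024 and index < len(units) - 1:
        size /= 1024
        index += 1
    return f"{size:.2f} {units[index]}"


def directory_size(root: Path) -> SizeReport:
    """Walk the whole tree under root and add up regular file sizes."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            if not path.is_symlink():  # skip symlinks so linked files are not counted twice
                total += path.stat().st_size
    return SizeReport(total, human_size(total))


# Usage on a throwaway directory standing in for a storage root.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "collection_a").mkdir()
    (root / "collection_a" / "data.bin").write_bytes(b"\0" * 1000)
    (root / "meta.json").write_bytes(b"\0" * 1048)
    report = directory_size(root)
```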

check_integrity(level=1, stop_on_error=True, to_file=False, collection=None)

Run a storage integrity check at one of four levels:

  1. checks Collections integrity. If no collection name was passed, iterates over all the Collections and initialises them one by one

  2. checks Arrays/VArrays initialization and lockfiles

  3. checks if Arrays/VArrays paths are valid, including symlinks

  4. checks whether stored data is consistent by reading a single point file by file

Parameters
  • collection (Optional[str]) – Name of a Collection. If passed, only that collection is checked; otherwise every collection in the storage is checked

  • level (int) – Check-level

  • stop_on_error (bool) – Flag to stop on first path or data error

  • to_file (Union[bool, Path, str]) – Dump errors to a file; accepts True/False or a path to a file. If True, errors are dumped into a file with a default name in the current directory; if a path is passed, errors are dumped to the file with the specified name and path.

Return type

None
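The three accepted forms of to_file can be pictured with a small helper. This is a sketch only: resolve_report_path and DEFAULT_REPORT_NAME are hypothetical, and the default file name in particular is invented for the illustration, since the name Deker actually uses is not stated here.

```python
from pathlib import Path
from typing import Optional, Union

# Hypothetical default name; the documentation above does not give the real one.
DEFAULT_REPORT_NAME = "integrity_report.txt"


def resolve_report_path(to_file: Union[bool, Path, str]) -> Optional[Path]:
    """Map the to_file argument to a report path, or None for no file."""
    if to_file is False:
        return None  # errors are not dumped to a file
    if to_file is True:
        return Path.cwd() / DEFAULT_REPORT_NAME  # default name in the current directory
    return Path(to_file)  # explicit path given as str or Path
```

The identity checks against True and False come first, so that a string or Path argument is never mistaken for a flag.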

clear_locks(collection_name=None)

Clear the readlocks of Arrays and/or VArrays.

Parameters

collection_name (Optional[str]) – Name of a Collection. If passed, locks are cleared only in that collection; otherwise locks are cleared in every collection in the storage

Return type

None

close()

Close client.

Return type

None

collection_from_dict(collection_data)

Create a new Collection in the database from a collection metadata dictionary.

Parameters

collection_data (dict) – Dictionary with collection metadata

Return type

Collection

create_collection(name, schema, collection_options=None, storage_adapter_type=None)

Create a new Collection in the database.

Parameters
  • name (str) – Name of new Collection

  • schema (Union[ArraySchema, VArraySchema]) – Array or VArray schema

  • collection_options (Optional[BaseCollectionOptions]) – Options for compression and chunks (if applicable)

  • storage_adapter_type (Optional[str]) – Type of an adapter, which works with files; default is HDF5StorageAdapter

Return type

Collection

get_collection(name)

Get Collection from database by its name.

Parameters

name (str) – Name of a Collection

Return type

Optional[Collection]

property is_closed: bool

Check whether the client is closed.

property is_open: bool

Check whether the client is open.

property meta_version: str

Get actual metadata version, provided by local adapters.

property root_path: Path

Get root path to the current storage.
