Commit e49776b

johnkerl and nguyenv authored
Update docstrings for new-shape feature (#250)
* Update docstrings for new-shape feature

* Apply suggestions from code review

Co-authored-by: nguyenv <vivian@tiledb.com>

* Undo previous, since it does not work on Python 3.9

---------

Co-authored-by: nguyenv <vivian@tiledb.com>
1 parent a59ba8c commit e49776b

3 files changed

Lines changed: 86 additions & 16 deletions

python-spec/src/somacore/data.py

Lines changed: 70 additions & 10 deletions
@@ -23,6 +23,7 @@
 
 from . import base
 from . import options
+from .types import StatusAndReason
 
 _RO_AUTO = options.ResultOrder.AUTO
 
@@ -71,18 +72,23 @@ def create(
                 All named columns must exist in the schema, and at least one
                 index column name is required.
 
-            domain: An optional sequence of tuples specifying the domain of each
-                index column. Each tuple should be a pair consisting of
-                the minimum and maximum values storable in the index column.
-                For example, if there is a single int64-valued index column,
-                then ``domain`` might be ``[(100, 200)]`` to indicate that
-                values between 100 and 200, inclusive, can be stored in that
-                column. If provided, this sequence must have the same length as
+            domain:
+                An optional sequence of tuples specifying the domain of each
+                index column. Each tuple must be a pair consisting of the
+                minimum and maximum values storable in the index column. For
+                example, if there is a single int64-valued index column, then
+                ``domain`` might be ``[(100, 200)]`` to indicate that values
+                between 100 and 200, inclusive, can be stored in that column.
+                If provided, this sequence must have the same length as
                 ``index_column_names``, and the index-column domain will be as
                 specified. If omitted entirely, or if ``None`` in a given
-                dimension, the corresponding index-column domain will use
-                the minimum and maximum possible values for the column's
-                datatype. This makes a dataframe growable.
+                dimension, the corresponding index-column domain will use an
+                empty range, and data writes after that will fail with an
+                exception. Unless you have a particular reason not to, you
+                should always provide the desired ``domain`` at create time: this
+                is an optional but strongly recommended parameter. See also
+                ``change_domain`` which allows you to expand the domain after
+                create.
 
             platform_config: platform-specific configuration; keys are SOMA
                 implementation names.
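
The rules in the new ``domain`` text (one entry per index column, each entry either ``None`` or a (min, max) pair) can be sketched as a small check. This is only an illustration: ``validate_create_domain`` is a hypothetical helper, not part of somacore, and the ``min <= max`` test is an assumption the docstring does not state.

```python
from typing import Any, List, Optional, Sequence, Tuple

Pair = Tuple[Any, Any]


def validate_create_domain(
    index_column_names: Sequence[str],
    domain: Optional[Sequence[Optional[Pair]]],
) -> List[Optional[Pair]]:
    """Hypothetical helper: check a ``domain`` argument against the index columns."""
    if domain is None:
        # Omitted entirely: no index column gets a usable range.
        return [None] * len(index_column_names)
    if len(domain) != len(index_column_names):
        raise ValueError(
            f"domain has {len(domain)} entries but there are "
            f"{len(index_column_names)} index columns"
        )
    checked: List[Optional[Pair]] = []
    for name, slot in zip(index_column_names, domain):
        if slot is None:
            # None in this dimension: left without a usable range.
            checked.append(None)
            continue
        if len(slot) != 2:
            raise ValueError(f"domain for {name!r} must be a (min, max) pair")
        low, high = slot
        if low > high:  # assumption: an inverted pair is rejected
            raise ValueError(f"domain for {name!r} has min > max")
        checked.append((low, high))
    return checked


# The docstring's own example: one int64 index column storing 100..200 inclusive.
single = validate_create_domain(["soma_joinid"], [(100, 200)])
# Omitting the domain leaves every column without a usable range.
omitted = validate_create_domain(["soma_joinid", "cell_type"], None)
```

A real implementation would also check each pair against the column's Arrow type; that step is left out here.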
@@ -164,6 +170,45 @@ def read(
         """
         raise NotImplementedError()
 
+    @abc.abstractmethod
+    def change_domain(
+        self,
+        newdomain: Optional[Sequence[Optional[Tuple[Any, Any]]]],
+        check_only: bool = False,
+    ) -> StatusAndReason:
+        """Allows you to enlarge the domain of a SOMA :class:`DataFrame`, when
+        the ``DataFrame`` already has a domain.
+
+        The argument must be a sequence of pairs of low/high values for the desired
+        domain, one pair per index column. For string index columns, you must
+        offer the low/high pair as ``("", "")``, or as ``None``. If ``check_only``
+        is ``True``, returns whether the operation would succeed if attempted,
+        and a reason why it would not.
+
+        For example, suppose the dataframe's sole index-column name is
+        ``"soma_joinid"`` (which is the default at create). If the dataframe's
+        ``.maxdomain`` is ``((0, 999999),)`` and its ``.domain`` is ``((0,
+        2899),)``, this means that ``soma_joinid`` values between 0 and 2899 can
+        be read or written; any attempt to read or write ``soma_joinid`` values
+        outside this range will result in an error. If you then apply
+        ``.change_domain([(0, 5699)])``, then ``.domain`` will
+        report ``((0, 5699),)``, and ``soma_joinid`` values in the range 0
+        to 5699 can now be written to the dataframe.
+
+        If you use non-default ``index_column_names`` in the dataframe's
+        ``create`` then you need to specify the (low, high) pairs for each
+        index column. For example, if the dataframe's ``index_column_names``
+        is ``["soma_joinid", "cell_type"]``, then you can upgrade domain using
+        ``[(0, 5699), ("", "")]``.
+
+        Lastly, it is an error to try to set the ``domain`` to be larger than
+        ``maxdomain`` along any index column. The ``maxdomain`` of a dataframe is
+        set at creation time, and cannot be extended afterward.
+
+        Lifecycle: maturing
+        """
+        raise NotImplementedError()
+
     @abc.abstractmethod
     def write(
         self,
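
To make the ``check_only`` contract of ``change_domain`` concrete, here is a minimal sketch of the check for integer index columns only. ``check_change_domain`` is a hypothetical free function, not the somacore API. It assumes a request is rejected when it leaves the ``maxdomain`` bounds and, by analogy with ``resize``, when it would shrink the current domain. String columns and ``None`` entries are not modeled.

```python
from typing import Sequence, Tuple

StatusAndReason = Tuple[bool, str]
IntPair = Tuple[int, int]


def check_change_domain(
    current: Sequence[IntPair],
    maxdomain: Sequence[IntPair],
    new: Sequence[IntPair],
) -> StatusAndReason:
    """Hypothetical dry-run check: would the new domain be accepted?"""
    if len(new) != len(current):
        return (False, "expected one (low, high) pair per index column")
    for i, (cur, top, req) in enumerate(zip(current, maxdomain, new)):
        cur_lo, cur_hi = cur
        max_lo, max_hi = top
        new_lo, new_hi = req
        # Assumption: the domain may only grow, never shrink.
        if new_lo > cur_lo or new_hi < cur_hi:
            return (False, f"column {i}: new domain would shrink the current domain")
        # maxdomain is fixed at creation time and bounds every later change.
        if new_lo < max_lo or new_hi > max_hi:
            return (False, f"column {i}: new domain exceeds maxdomain")
    return (True, "")


# Values from the docstring example: domain (0, 2899), maxdomain (0, 999999).
ok = check_change_domain([(0, 2899)], [(0, 999999)], [(0, 5699)])
too_big = check_change_domain([(0, 2899)], [(0, 999999)], [(0, 2_000_000)])
```

Here ``ok`` is ``(True, "")``, while ``too_big`` carries ``False`` and a message naming the offending column.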
@@ -261,6 +306,21 @@ def create(
         """
         raise NotImplementedError()
 
+    def resize(
+        self, newshape: Sequence[Union[int, None]], check_only: bool = False
+    ) -> StatusAndReason:
+        """Increases the shape of the array as specified. Raises an error if the new
+        shape is less than the current shape in any dimension. Raises an error if
+        the new shape exceeds maxshape in any dimension. Raises an error if the
+        array doesn't already have a shape: in that case please call
+        tiledbsoma_upgrade_shape. If ``check_only`` is ``True``, returns
+        whether the operation would succeed if attempted, and a reason why it
+        would not.
+
+        Lifecycle: maturing
+        """
+        raise NotImplementedError()
+
     # Metadata operations
 
     @property
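
The ``resize`` docstring describes two modes: a dry run that reports a ``StatusAndReason``, and a real call that raises on failure. The sketch below shows one way those modes could fit together. ``ToyArray`` is a hypothetical in-memory stand-in, not a SOMA implementation. It accepts integer shapes only; ``None`` entries in ``newshape`` and the "array has no shape yet" case are not modeled.

```python
from typing import Sequence, Tuple

StatusAndReason = Tuple[bool, str]


class ToyArray:
    """Hypothetical stand-in used only to illustrate the resize contract."""

    def __init__(self, shape: Sequence[int], maxshape: Sequence[int]) -> None:
        self.shape = tuple(shape)
        self.maxshape = tuple(maxshape)

    def _check(self, newshape: Sequence[int]) -> StatusAndReason:
        if len(newshape) != len(self.shape):
            return (False, "wrong number of dimensions")
        for i, (new, cur, top) in enumerate(zip(newshape, self.shape, self.maxshape)):
            if new < cur:
                return (False, f"dimension {i}: {new} is less than current {cur}")
            if new > top:
                return (False, f"dimension {i}: {new} exceeds maxshape {top}")
        return (True, "")

    def resize(self, newshape: Sequence[int], check_only: bool = False) -> StatusAndReason:
        status = self._check(newshape)
        if check_only:
            # Dry run: report the outcome without touching the shape.
            return status
        ok, reason = status
        if not ok:
            raise ValueError(reason)
        self.shape = tuple(newshape)
        return status


arr = ToyArray(shape=(100, 200), maxshape=(1000, 1000))
dry_run = arr.resize((150, 200), check_only=True)  # shape is still (100, 200)
rejected = arr.resize((50, 200), check_only=True)  # smaller than current shape
arr.resize((150, 200))                             # applies the change
```

A caller can use the dry run to surface the reason string to a user before attempting the real resize.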

python-spec/src/somacore/spatial.py

Lines changed: 12 additions & 6 deletions
@@ -85,12 +85,18 @@ def create(
                 implementation, an error will be raised.
             coordinate_space: Either the coordinate space or the axis names for the
                 coordinate space the point cloud is defined on.
-            domain: An optional sequence of tuples specifying the domain of each
-                index column. Each tuple should be a pair consisting of the minimum
-                and maximum values storable in the index column. If omitted entirely,
-                or if ``None`` in a given dimension, the corresponding index-column
-                domain will use the minimum and maximum possible values for the
-                column's datatype. This makes a point cloud dataframe growable.
+            domain:
+                An optional sequence of tuples specifying the domain of each
+                index column. Each tuple must be a pair consisting of the
+                minimum and maximum values storable in the index column.
+                If provided, this sequence must have the same length as
+                ``index_column_names``, and the index-column domain will be as
+                specified. If omitted entirely, or if ``None`` in a given
+                dimension, the corresponding index-column domain will use an
+                empty range, and data writes after that will fail with an
+                exception. Unless you have a particular reason not to, you
+                should always provide the desired ``domain`` at create time: this
+                is an optional but strongly recommended parameter.
             platform_config: platform-specific configuration; keys are SOMA
                 implementation names.
             context: Other implementation-specific configuration.

python-spec/src/somacore/types.py

Lines changed: 4 additions & 0 deletions
@@ -17,6 +17,10 @@
 
 from typing_extensions import Protocol, TypeGuard
 
+StatusAndReason = Tuple[bool, str]
+"""Information for whether an upgrade-shape or resize would succeed
+if attempted, along with a reason why not."""
+
 
 def is_nonstringy_sequence(it: object) -> TypeGuard[Sequence]:
     """Returns true if a sequence is a "normal" sequence and not str or bytes.
