tiler
Github repository | Github issues | Documentation
This Python package provides consistent and user-friendly functions for tiling/patching and subsequent merging of NumPy arrays.
Such tiling is often required for various heavy image-processing tasks such as semantic segmentation in deep learning, especially in domains where images do not fit into GPU memory (e.g., hyperspectral satellite images, whole slide images, videos, tomography data).
Please see the Quick start section.
If you want to use tiler interactively, I highly recommend napari and the napari-tiler plugin.
Features
- N-dimensional
- Optional in-place tiling
- Optional channel dimension (dimension that is not tiled)
- Optional tile batching
- Tile overlapping
- Access individual tiles with an iterator or a getter
- Tile merging, with optional window functions/tapering
Quick start
You can find more examples in the examples directory.
For more Tiler and Merger functionality, please check the documentation.
import numpy as np
from tiler import Tiler, Merger

image = np.random.random((3, 1920, 1080))

# Setup tiling parameters
tiler = Tiler(data_shape=image.shape,
              tile_shape=(3, 250, 250),
              channel_dimension=0)

## Access tiles:
# 1. with an iterator
for tile_id, tile in tiler.iterate(image):
    print(f'Tile {tile_id} out of {len(tiler)} tiles.')
# 1b. the iterator can also be accessed through __call__
for tile_id, tile in tiler(image):
    print(f'Tile {tile_id} out of {len(tiler)} tiles.')
# 2. individually
tile_3 = tiler.get_tile(image, 3)
# 3. in batches
tiles_in_batches = [batch for _, batch in tiler(image, batch_size=10)]

# Setup merging parameters
merger = Merger(tiler)

## Merge tiles:
# 1. one by one
# (some_processing_fn is a placeholder for your own per-tile function)
for tile_id, tile in tiler(image):
    merger.add(tile_id, some_processing_fn(tile))
# 2. in batches
merger.reset()
for batch_id, batch in tiler(image, batch_size=10):
    merger.add_batch(batch_id, 10, batch)

# Final merging: applies tapering and optional unpadding
final_image = merger.merge(unpad=True)  # (3, 1920, 1080)
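To make the idea of merging with a window function more concrete, here is a minimal 1D sketch in plain Python. It is not how Merger is implemented; `triangular_window` and `merge_1d` are hypothetical helpers, and the triangular window is only an assumed stand-in for a tapering function. Each tile is weighted by the window, accumulated at its offset, and the sum is divided by the accumulated weights.

```python
def triangular_window(n):
    """Hypothetical tapering window: largest in the tile centre, never zero."""
    return [min(i + 1, n - i) for i in range(n)]


def merge_1d(tiles, step, length, window):
    """Weighted average of overlapping 1D tiles placed every `step` elements."""
    total = [0.0] * length
    weight = [0.0] * length
    for k, tile in enumerate(tiles):
        start = k * step
        for i, value in enumerate(tile):
            pos = start + i
            if pos < length:  # skip padded elements past the data edge
                total[pos] += window[i] * value
                weight[pos] += window[i]
    return [t / w for t, w in zip(total, weight)]


signal = [float(x) for x in range(10)]
tile_size, step = 4, 2  # neighbouring tiles overlap by 2 elements

# Cut overlapping tiles, zero-padding the last one if it is short
tiles = []
for start in range(0, len(signal) - (tile_size - step), step):
    tile = signal[start:start + tile_size]
    tiles.append(tile + [0.0] * (tile_size - len(tile)))

merged = merge_1d(tiles, step, len(signal), triangular_window(tile_size))
```

Because the tiles here are merged back unprocessed, `merged` reproduces `signal`. With real per-tile processing, the window reduces the influence of tile borders, where predictions tend to be least reliable.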
Installation
The latest release is available through pip:
pip install tiler
Alternatively, you can clone the repository and install it manually:
git clone git@github.com:the-lay/tiler.git
cd tiler
pip install .
If you are planning to contribute, please take a look at the contribution instructions.
Motivation & other packages
I work on semantic segmentation of patched 3D data, and I often found myself reusing tiling functions that I had written for previous projects. None of the existing libraries listed below fit my use case, so I wrote this one.
However, other libraries/examples might fit you better:
- vfdev-5/ImageTilingUtils
- Minimalistic image reader agnostic 2D tiling tools
-
- Powerful PyTorch toolset that has 2D image tiling and on-GPU merger
- Vooban/Smoothly-Blend-Image-Patches
- Mirroring and D4 rotations data (8-fold) augmentation with squared spline window function for 2D images
-
- Slicing and merging 2D image into N equally sized tiles
-
- Tile and merge 2D, 3D images defined by tile shapes and step between tiles
Do you know any other similar packages?
- Please let me know by contacting me, making a PR or opening a new issue.
Moreover, some related approaches have been described in the literature:
- Introducing Hann windows for reducing edge-effects in patch-based image segmentation, Pielawski and Wählby, March 2020
Frequently asked questions
This section is a work in progress.
How do I create tiles with fewer dimensions than the data array?
Tiler expects tile_shape to have at most as many elements as data_shape. If tile_shape has fewer elements than data_shape, ones are prepended to tile_shape until it matches the size of data_shape.
For example, to get 2D tiles out of a 3D array, you can initialize Tiler like this: Tiler(data_shape=(128,128,128), tile_shape=(128, 128)), which is equivalent to Tiler(data_shape=(128,128,128), tile_shape=(1, 128, 128)).
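The prepending rule can be written down in a few lines. This is a minimal sketch in plain Python, not the library's code; `match_tile_shape` is a hypothetical helper introduced only for illustration.

```python
def match_tile_shape(data_shape, tile_shape):
    """Prepend ones to tile_shape until it has as many elements as data_shape."""
    missing = len(data_shape) - len(tile_shape)
    if missing < 0:
        raise ValueError("tile_shape has more elements than data_shape")
    return (1,) * missing + tuple(tile_shape)


# 2D tiles out of a 3D array: one leading 1 is added
adjusted = match_tile_shape((128, 128, 128), (128, 128))
```

With these shapes and no overlap, each of the 128 slices along the first axis becomes one tile.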
import sys

from tiler.merger import Merger
from tiler.tiler import Tiler

__all__ = ["Tiler", "Merger"]

# Import README file as a module general docstring, only when generating documentation
# We also modify it to make it prettier
if "pdoc" in sys.modules:  # pragma: no cover
    with open("README.md", "r") as f:
        _readme = f.read()

    # remove baby logo and header
    _readme = _readme.split("\n", 2)[2]

    # replace teaser image path
    _readme = _readme.replace("misc/teaser/tiler_teaser.png", "tiler_teaser.png")
    _readme = _readme.replace("misc/baby_logo.png", "baby_logo.png")
    __doc__ = _readme
class Tiler:
    TILING_MODES = ["constant", "drop", "irregular", "reflect", "edge", "wrap"]
    r"""
    Supported tiling modes:
    - `constant` (default)
        If a tile is smaller than `tile_shape`, pad it with the constant value along each axis to match `tile_shape`.
        Set the value with the keyword `constant_value`.
    - `drop`
        If a tile is smaller than `tile_shape` in any of the dimensions, ignore it. Can result in zero tiles.
    - `irregular`
        Allow tiles to be smaller than `tile_shape`.
    - `reflect`
        If a tile is smaller than `tile_shape`,
        pad it with the reflection of values along each axis to match `tile_shape`.
    - `edge`
        If a tile is smaller than `tile_shape`,
        pad it with the edge values of data along each axis to match `tile_shape`.
    - `wrap`
        If a tile is smaller than `tile_shape`,
        pad it with the wrap of the vector along each axis to match `tile_shape`.
        The first values are used to pad the end and the end values are used to pad the beginning.
    """

    def __init__(
        self,
        data_shape: Union[Tuple, List, np.ndarray],
        tile_shape: Union[Tuple, List, np.ndarray],
        overlap: Union[int, float, Tuple, List, np.ndarray] = 0,
        channel_dimension: Optional[int] = None,
        mode: str = "constant",
        constant_value: float = 0.0,
    ):
        """Tiler class precomputes everything for tiling with specified parameters, without actually slicing data.
        You can access tiles individually with `Tiler.get_tile()` or with an iterator, both individually and in batches,
        with `Tiler.iterate()` (or the alias `Tiler.__call__()`).

        Args:
            data_shape (tuple, list or np.ndarray): Input data shape.
                If there is a channel dimension, it should be included in the shape.
                For example, (1920, 1080, 3), [512, 512, 512] or np.ndarray([3, 1024, 768]).

            tile_shape (tuple, list or np.ndarray): Shape of a tile.
                Tile must have the same number of dimensions as data or less.
                If less, the shape will be automatically prepended with ones to match data_shape size.
                For example, (256, 256, 3), [64, 64, 64] or np.ndarray([3, 128, 128]).

            overlap (int, float, tuple, list or np.ndarray): Specifies overlap between tiles.
                If integer, the same overlap of overlap pixels applied in each dimension, except channel_dimension.
                If float, percentage of a tile_shape to overlap [0.0, 1.0), from 0% to 100% non-inclusive, except
                channel_dimension.
                If tuple, list or np.ndarray, explicit size of the overlap (must be smaller than tile_shape in each
                dimension).
                Default is `0`.

            channel_dimension (int, optional): Specifies which axis is the channel dimension that will not be tiled.
                Usually it is the last or the first dimension of the array.
                Negative indexing (`-len(data_shape)` to `-1` inclusive) is allowed.
                Default is `None`, no channel dimension in the data.

            mode (str): Defines how the data will be tiled.
                Must be one of the supported `Tiler.TILING_MODES`. Defaults to `"constant"`.

            constant_value (float): Specifies the value of padding when `mode='constant'`.
                Default is `0.0`.
        """

        self.recalculate(
            data_shape=data_shape,
            tile_shape=tile_shape,
            overlap=overlap,
            channel_dimension=channel_dimension,
            mode=mode,
            constant_value=constant_value,
        )

    def recalculate(
        self,
        data_shape: Optional[Union[Tuple, List, np.ndarray]] = None,
        tile_shape: Optional[Union[Tuple, List, np.ndarray]] = None,
        overlap: Optional[Union[int, float, Tuple, List, np.ndarray]] = None,
        channel_dimension: Optional[int] = None,
        mode: Optional[str] = None,
        constant_value: Optional[float] = None,
    ) -> None:
        """Recalculates tiling for new given settings.
        If a passed value is None, use previously given value.

        For more information about each argument see `Tiler.__init__()` documentation.
        """

        # Data and tile shapes
        if data_shape is not None:
            self.data_shape = np.asarray(data_shape, dtype=np.int64)
        if tile_shape is not None:
            self.tile_shape = np.atleast_1d(np.asarray(tile_shape, dtype=np.int64))

            # Append ones to match data_shape size
            if self.tile_shape.size < self.data_shape.size:
                size_difference = self.data_shape.size - self.tile_shape.size
                self.tile_shape = np.insert(arr=self.tile_shape, obj=0, values=np.ones(size_difference), axis=0)
                warnings.warn(
                    f"Tiler automatically adjusted tile_shape from {tuple(tile_shape)} to {tuple(self.tile_shape)}.",
                    stacklevel=2,
                )
        self._n_dim: int = len(self.data_shape)
        if (self.tile_shape <= 0).any() or (self.data_shape <= 0).any():
            raise ValueError("Tile and data shapes must be tuple, list or ndarray of positive integers.")
        if self.tile_shape.size != self.data_shape.size:
            raise ValueError(
                "Tile shape must have less or equal number of elements compared to the data shape. "
                "If less, your tile shape will be prepended with ones to match the data shape, "
                "e.g. data_shape=(28, 28), tile_shape=(28) -> tile_shape=(1, 28)."
            )

        # Tiling mode
        if mode is not None:
            self.mode = mode
        if self.mode not in self.TILING_MODES:
            raise ValueError(f"{self.mode} is an unsupported tiling mode, please check the documentation.")

        # Constant value used for constant tiling mode
        if constant_value is not None:
            self.constant_value = constant_value

        # Channel dimension can be None which means we need to check for init too
        if not hasattr(self, "channel_dimension") or channel_dimension is not None:
            self.channel_dimension = channel_dimension
        if self.channel_dimension:
            if (self.channel_dimension >= self._n_dim) or (self.channel_dimension < -self._n_dim):
                raise ValueError(
                    f"Specified channel dimension is out of bounds "
                    f"(should be None or an integer from {-self._n_dim} to {self._n_dim - 1})."
                )
            if self.channel_dimension < 0:
                # negative indexing
                self.channel_dimension = self._n_dim + self.channel_dimension

        # Overlap and step
        if overlap is not None:
            self.overlap = overlap
        if isinstance(self.overlap, float):
            if self.overlap < 0 or self.overlap >= 1.0:
                raise ValueError("Float overlap must be in range of [0.0, 1.0) i.e. [0%, 100%).")

            self._tile_overlap: np.ndarray = np.ceil(self.overlap * self.tile_shape).astype(int)
            if self.channel_dimension is not None:
                self._tile_overlap[self.channel_dimension] = 0

        elif isinstance(self.overlap, int):
            tile_shape_without_channel = self.tile_shape[np.arange(self._n_dim) != self.channel_dimension]
            if self.overlap < 0 or np.any(self.overlap >= tile_shape_without_channel):
                raise ValueError(f"Integer overlap must be in range of 0 to {np.max(tile_shape_without_channel)}")

            self._tile_overlap: np.ndarray = np.array([self.overlap for _ in self.tile_shape])
            if self.channel_dimension is not None:
                self._tile_overlap[self.channel_dimension] = 0

        elif isinstance(self.overlap, (list, tuple, np.ndarray)):
            if np.any(np.array(self.overlap) < 0) or np.any(self.overlap >= self.tile_shape):
                raise ValueError("Overlap size must be smaller than tile_shape.")

            self._tile_overlap: np.ndarray = np.array(self.overlap).astype(int)

        else:
            raise ValueError("Unsupported overlap mode (not float, int, list, tuple or np.ndarray).")

        self._tile_step: np.ndarray = (self.tile_shape - self._tile_overlap).astype(int)  # tile step

        # Calculate mosaic (collection of tiles) shape
        div, mod = np.divmod(
            [self.data_shape[d] - self._tile_overlap[d] for d in range(self._n_dim)],
            self._tile_step,
        )
        if self.mode == "drop":
            self._indexing_shape = div
        else:
            self._indexing_shape = div + (mod != 0)
        if self.channel_dimension is not None:
            self._indexing_shape[self.channel_dimension] = 1

        # Calculate new shape assuming tiles are padded
        if self.mode == "irregular":
            self._new_shape = self.data_shape
        else:
            self._new_shape = (self._indexing_shape * self._tile_step) + self._tile_overlap
        self._shape_diff = self._new_shape - self.data_shape
        if self.channel_dimension is not None:
            self._shape_diff[self.channel_dimension] = 0

        # If channel dimension is given, set tile_step of that dimension to 0
        if self.channel_dimension is not None:
            self._tile_step[self.channel_dimension] = 0

        # Tile indexing
        self._tile_index = np.vstack(np.meshgrid(*[np.arange(0, x) for x in self._indexing_shape], indexing="ij"))
        self._tile_index = self._tile_index.reshape(self._n_dim, -1).T
        self.n_tiles = len(self._tile_index)

        if self.n_tiles == 0:
            warnings.warn(
                f"Tiler (mode={mode}, overlap={overlap}) will split data_shape {data_shape} "
                f"into zero tiles (tile_shape={tile_shape}).",
                stacklevel=2,
            )

    def __len__(self) -> int:
        """
        Returns:
            int: Number of tiles in the mosaic.
        """
        return self.n_tiles

    def __repr__(self) -> str:
        """
        Returns:
            str: String representation of the object.
        """
        return (
            f"Tiler split {self.data_shape.tolist()} data into {len(self)} tiles of {self.tile_shape.tolist()}."
            f"\n\tMosaic shape: {self._indexing_shape.tolist()}"
            f"\n\tPadded shape: {self._new_shape.tolist()}"
            f"\n\tTile overlap: {self.overlap}"
            f"\n\tElement step: {self._tile_step.tolist()}"
            f"\n\tMode: {self.mode}"
            f"\n\tChannel dimension: {self.channel_dimension}"
        )

    def __call__(
        self,
        data: Union[np.ndarray, Callable[..., np.ndarray]],
        progress_bar: bool = False,
        batch_size: int = 0,
        drop_last: bool = False,
        copy_data: bool = True,
    ) -> Generator[Tuple[int, np.ndarray], None, None]:
        """Alias for `Tiler.iterate()`"""
        return self.iterate(data, progress_bar, batch_size, drop_last, copy_data)

    def iterate(
        self,
        data: Union[np.ndarray, Callable[..., np.ndarray]],
        progress_bar: bool = False,
        batch_size: int = 0,
        drop_last: bool = False,
        copy_data: bool = True,
    ) -> Generator[Tuple[int, np.ndarray], None, None]:
        """Iterates through tiles of the given data array. This method can also be accessed by `Tiler.__call__()`.

        Args:
            data (np.ndarray or callable): The data array on which the tiling will be performed. A callable can be
                supplied to load data into memory instead of slicing from an array. The callable should take integers
                as input, the smallest tile corner coordinates and tile size in each dimension, and output numpy array.

                e.g.
                *python-bioformats*
                ```python
                >>> tileSize = 2000
                >>> tiler = Tiler((sizeX, sizeY, sizeC), (tileSize, tileSize, sizeC))
                >>> def reader_func(*args):
                >>>     X, Y, W, H = args[0], args[1], args[3], args[4]
                >>>     return reader.read(XYWH=[X, Y, W, H])
                >>> for t_id, tile in tiler.iterate(reader_func):
                >>>     pass
                ```
                *open-slide*
                ```python
                >>> reader_func = lambda *args: wsi.read_region([args[0], args[1]], 0, [args[3], args[4]])
                >>> for t_id, tile in tiler.iterate(reader_func):
                >>>     pass
                ```

            progress_bar (bool): Specifies whether to show the progress bar or not.
                Uses `tqdm` package.
                Default is `False`.

            batch_size (int): Specifies returned batch size.
                If `batch_size == 0`, return one tile at a time.
                If `batch_size >= 1`, return in batches (returned shape: `[batch_size, *tile_shape]`).
                Default is 0.

            drop_last (bool): Specifies whether to drop last non-full batch.
                Used only when batch_size > 0.
                Default is False.

            copy_data (bool): Specifies whether to copy the tile before returning it.
                If `copy_data == False`, returns a view.
                Default is True.

        Yields:
            (int, np.ndarray): Tuple with integer tile number and array tile data.
        """

        if batch_size < 0:
            raise ValueError(f"Batch size must >= 0, not {batch_size}")

        # return a tile at a time
        if batch_size == 0:
            for tile_i in tqdm(range(self.n_tiles), disable=not progress_bar, unit=" tiles"):
                yield tile_i, self.get_tile(data, tile_i, copy_data=copy_data)

        # return in batches
        if batch_size > 0:
            # check for drop_last
            length = (self.n_tiles - (self.n_tiles % batch_size)) if drop_last else self.n_tiles

            for tile_i in tqdm(range(0, length, batch_size), disable=not progress_bar, unit=" batches"):
                tiles = np.stack(
                    [
                        self.get_tile(data, x, copy_data=copy_data)
                        for x in range(tile_i, min(tile_i + batch_size, length))
                    ]
                )
                yield tile_i // batch_size, tiles

    def get_tile(
        self,
        data: Union[np.ndarray, Callable[..., np.ndarray]],
        tile_id: int,
        copy_data: bool = True,
    ) -> np.ndarray:
        """Returns an individual tile.

        Args:
            data (np.ndarray or callable): Data from which `tile_id`-th tile will be taken. A callable can be
                supplied to load data into memory instead of slicing from an array. The callable should take integers
                as input, the smallest tile corner coordinates and tile size in each dimension, and output numpy array.

                e.g.
                *python-bioformats*
                ```python
                >>> tileSize = 2000
                >>> tiler = Tiler((sizeX, sizeY, sizeC), (tileSize, tileSize, sizeC))
                >>> def reader_func(*args):
                >>>     X, Y, W, H = args[0], args[1], args[3], args[4]
                >>>     return reader.read(XYWH=[X, Y, W, H])
                >>> tiler.get_tile(reader_func, 0)
                ```
                *open-slide*
                ```python
                >>> reader_func = lambda *args: wsi.read_region([args[0], args[1]], 0, [args[3], args[4]])
                >>> tiler.get_tile(reader_func, 0)
                ```

            tile_id (int): Specifies which tile to return. Must be smaller than the total number of tiles.

            copy_data (bool): Specifies whether returned tile is a copy.
                If `copy_data == False` returns a view.
                Default is True.

        Returns:
            np.ndarray: Content of tile number `tile_id`, padded if necessary.
        """

        if (tile_id < 0) or (tile_id >= self.n_tiles):
            raise IndexError(
                f"Out of bounds, there is no tile {tile_id}.There are {len(self) - 1} tiles, starting from index 0."
            )

        if isinstance(data, np.ndarray) and np.not_equal(data.shape, self.data_shape).any():
            raise ValueError(
                f"Shape of provided data array ({data.shape}) does not match "
                f"same as Tiler's data_shape ({tuple(self.data_shape)})."
            )

        # get tile data
        tile_corner = self._tile_index[tile_id] * self._tile_step
        # take the lesser of the tile shape and the distance to the edge
        sampling = [
            slice(
                tile_corner[d],
                np.min([self.data_shape[d], tile_corner[d] + self.tile_shape[d]]),
            )
            for d in range(self._n_dim)
        ]

        if callable(data):
            sampling = [x.stop - x.start for x in sampling]
            tile_data = data(*tile_corner, *sampling)
        else:
            tile_data = data[tuple(sampling)]

        if copy_data:
            tile_data = tile_data.copy()

        shape_diff = self.tile_shape - tile_data.shape
        if (self.mode != "irregular") and np.any(shape_diff > 0):
            if self.mode == "constant":
                tile_data = np.pad(
                    tile_data,
                    list((0, diff) for diff in shape_diff),
                    mode=self.mode,
                    constant_values=self.constant_value,
                )
            elif self.mode == "reflect" or self.mode == "edge" or self.mode == "wrap":
                tile_data = np.pad(tile_data, list((0, diff) for diff in shape_diff), mode=self.mode)

        return tile_data

    def get_all_tiles(
        self,
        data: Union[np.ndarray, Callable[..., np.ndarray]],
        axis: int = 0,
        copy_data: bool = True,
    ) -> np.ndarray:
        """Returns all tiles joined along a new axis. Does not work for `Tiler.mode = 'irregular'`.

        The `axis` parameter specifies the index of the new axis in the dimensions of the result.
        For example, if `axis=0` it will be the first dimension and if `axis=-1` it will be the last dimension.

        For more information about `data` and `copy_data` parameters, see `Tiler.get_tile()`.

        Args:
            data (np.ndarray or callable): Data which will be tiled. A callable can be supplied to load data into memory
                instead of slicing from an array. The callable should take integers as input, the smallest tile corner
                coordinates and tile size in each dimension, and output numpy array.

            axis (int): The axis in the result array along which the tiles are stacked.

            copy_data (bool): Specifies whether returned tile is a copy.
                If `copy_data == False` returns a view.
                Default is True.

        Returns:
            np.ndarray: All tiles stacked along a new axis.
        """

        if self.mode == "irregular":
            raise ValueError("get_all_tiles does not support irregular mode")

        return np.stack(
            [self.get_tile(data, x, copy_data=copy_data) for x in range(self.n_tiles)],
            axis=axis,
        )

    def get_tile_bbox(
        self,
        tile_id: int,
        with_channel_dim: bool = False,
        all_corners: bool = False,
    ) -> Union[Tuple[np.ndarray, np.ndarray], np.ndarray]:
        """Returns coordinates of the opposite corners of the bounding box (hyperrectangle?) of the tile on padded data.

        Args:
            tile_id (int): Specifies which tile's bounding coordinates will be returned.
                Must be between 0 and the total number of tiles.

            with_channel_dim (bool): Specifies whether to return shape with channel dimension or without.
                Default is False.

            all_corners (bool): If True, returns all vertices of the bounding box.
                Default is False.

        Returns:
            (np.ndarray, np.ndarray): Smallest (bottom-left) and largest (top-right) corners of the bounding box.

            np.ndarray: All corners of the bounding box, if `all_corners=True`.
        """

        if (tile_id < 0) or (tile_id >= self.n_tiles):
            raise IndexError(
                f"Out of bounds, there is no tile {tile_id}. There are {len(self) - 1} tiles, starting from index 0."
            )

        # find min and max vertices
        bottom_left_corner = self._tile_step * self.get_tile_mosaic_position(tile_id, True)
        top_right_corner = bottom_left_corner + self.tile_shape

        # remove channel dimension if not required
        if self.channel_dimension is not None and not with_channel_dim:
            dim_indices = list(range(self.channel_dimension)) + list(
                range(self.channel_dimension + 1, len(self._tile_step))
            )
            bottom_left_corner = bottom_left_corner[dim_indices]
            top_right_corner = top_right_corner[dim_indices]

        # by default, return only min/max vertices
        if not all_corners:
            return bottom_left_corner, top_right_corner

        # otherwise, return all vertices of the bbox
        # inspired by https://stackoverflow.com/a/57065356/1668421
        # but instead create an indexing array from cartesian product of bits
        # and use it to sample intervals
        else:
            n_dim: int = len(bottom_left_corner)  # already channel_dimension adjusted
            mins = np.minimum(bottom_left_corner, top_right_corner)
            maxs = np.maximum(bottom_left_corner, top_right_corner)
            intervals = np.stack([mins, maxs], -1)
            indexing = np.array(list(itertools.product([0, 1], repeat=n_dim)))
            corners = np.stack([intervals[x][indexing.T[x]] for x in range(n_dim)], -1)
            return corners

    def get_tile_mosaic_position(self, tile_id: int, with_channel_dim: bool = False) -> np.ndarray:
        """Returns tile position in the mosaic.

        Args:
          tile_id (int): Specifies which tile's mosaic position will be returned. \
            Must be smaller than the total number of tiles.

          with_channel_dim (bool): Specifies whether to return position with channel dimension or without.
            Default is False.

        Returns:
            np.ndarray: Tile mosaic position (tile position relative to other tiles).
        """
        if (tile_id < 0) or (tile_id >= self.n_tiles):
            raise IndexError(
                f"Out of bounds, there is no tile {tile_id}. There are {len(self) - 1} tiles, starting from index 0."
            )

        if self.channel_dimension is not None and not with_channel_dim:
            return self._tile_index[tile_id][~(np.arange(self._n_dim) == self.channel_dimension)]
        return self._tile_index[tile_id]

    def get_mosaic_shape(self, with_channel_dim: bool = False) -> np.ndarray:
        """Returns mosaic shape.

        Args:
            with_channel_dim (bool):
                Specifies whether to return shape with channel dimension or without. Defaults to False.

        Returns:
            np.ndarray: Shape of tiles mosaic.
        """
        if self.channel_dimension is not None and not with_channel_dim:
            return self._indexing_shape[~(np.arange(self._n_dim) == self.channel_dimension)]
        return self._indexing_shape

    def calculate_padding(self) -> Tuple[np.ndarray, List[Tuple[int, int]]]:
        """Calculate a frame padding for the current Tiler parameters.
        The padding is overlap//2 or tile_step//2, whichever is bigger.
        The method returns a tuple (new_shape, padding) where padding is
        ((before_1, after_1), … (before_N, after_N)), unique pad widths for each axis N.

        In the usual workflow, you'd recalculate tiling settings and then apply padding, prior to tiling.
        Then when merging, pass padding to `Merger.merge(extra_padding=padding, ...)`:
        ```python
        >>> tiler = Tiler(...)
        >>> merger = Merger(tiler, ...)
        >>> new_shape, padding = tiler.calculate_padding()
        >>> tiler.recalculate(data_shape=new_shape)
        >>> padded_data = np.pad(data, pad_width=padding, mode="reflect")
        >>> for tile_id, tile in tiler(padded_data):
        >>>     processed_tile = process(tile)
        >>>     merger.add(tile_id, processed_tile)
        >>> final_image = merger.merge(extra_padding=padding)
        ```
        Return:
            np.ndarray: Resulting shape when padding is applied.

            List[Tuple[int, int]]: Calculated padding.
        """

        # Choosing padding
        pre_pad = np.maximum(self._tile_step // 2, self._tile_overlap // 2)
        post_pad = pre_pad + np.mod(self._tile_step, 2)

        new_shape = pre_pad + self.data_shape + post_pad
        padding = list(zip(pre_pad, post_pad, strict=False))

        return new_shape, padding
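The padding rule used by calculate_padding() in the listing above can be checked by hand. Below is a minimal sketch in plain Python that mirrors that arithmetic; `frame_padding` is a hypothetical helper, not part of tiler, and the step and overlap values are assumed for illustration (they correspond to 250-pixel tiles with a 50-pixel overlap and a channel axis first).

```python
def frame_padding(data_shape, tile_step, tile_overlap):
    """Per-axis (before, after) padding: max(step // 2, overlap // 2),
    plus one extra element after the data when the step is odd."""
    padding = []
    for step, overlap in zip(tile_step, tile_overlap):
        pre = max(step // 2, overlap // 2)
        post = pre + step % 2
        padding.append((pre, post))
    new_shape = tuple(
        size + pre + post for size, (pre, post) in zip(data_shape, padding)
    )
    return new_shape, padding


# The channel axis has step 0 and overlap 0, so it receives no padding
new_shape, padding = frame_padding(
    data_shape=(3, 1920, 1080),
    tile_step=(0, 200, 200),
    tile_overlap=(0, 50, 50),
)
```

Here each spatial axis is padded by 100 elements on both sides, so the padded shape is (3, 2120, 1280).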
Tiler class precomputes everything for tiling with specified parameters, without actually slicing data. You can access tiles individually with Tiler.get_tile() or with an iterator, both individually and in batches, with Tiler.iterate() (or the alias Tiler.__call__()).
Arguments:
- data_shape (tuple, list or np.ndarray): Input data shape. If there is a channel dimension, it should be included in the shape. For example, (1920, 1080, 3), [512, 512, 512] or np.ndarray([3, 1024, 768]).
- tile_shape (tuple, list or np.ndarray): Shape of a tile. Tile must have the same number of dimensions as data or fewer. If fewer, the shape will be automatically prepended with ones to match data_shape size. For example, (256, 256, 3), [64, 64, 64] or np.ndarray([3, 128, 128]).
- overlap (int, float, tuple, list or np.ndarray): Specifies overlap between tiles. If integer, the same overlap, in pixels, is applied in each dimension except channel_dimension. If float, the fraction of tile_shape to overlap, in the range [0.0, 1.0) (from 0% to 100% non-inclusive), applied in each dimension except channel_dimension. If tuple, list or np.ndarray, the explicit size of the overlap (must be smaller than tile_shape in each dimension). Default is 0.
- channel_dimension (int, optional): Specifies which axis is the channel dimension that will not be tiled. Usually it is the last or the first dimension of the array. Negative indexing (-len(data_shape) to -1 inclusive) is allowed. Default is None, no channel dimension in the data.
- mode (str): Defines how the data will be tiled. Must be one of the supported Tiler.TILING_MODES. Defaults to "constant".
- constant_value (float): Specifies the value of padding when mode='constant'. Default is 0.0.
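To see how overlap, tile_shape and mode together determine the number of tiles, the arithmetic from recalculate() can be reproduced in plain Python. This is a minimal sketch under that reading of the source, not the library's API; `mosaic_shape` is a hypothetical helper and it skips the input validation that Tiler performs.

```python
import math


def mosaic_shape(data_shape, tile_shape, overlap=0, channel_dimension=None, mode="constant"):
    """Return (per-axis overlap, per-axis step, tiles per axis)."""
    n_dim = len(data_shape)
    if isinstance(overlap, float):
        # fraction of the tile size, rounded up
        tile_overlap = [math.ceil(overlap * t) for t in tile_shape]
    elif isinstance(overlap, int):
        tile_overlap = [overlap] * n_dim
    else:
        tile_overlap = [int(o) for o in overlap]

    # scalar overlaps never apply to the channel axis
    if isinstance(overlap, (int, float)) and channel_dimension is not None:
        tile_overlap[channel_dimension] = 0

    tile_step = [t - o for t, o in zip(tile_shape, tile_overlap)]

    counts = []
    for d in range(n_dim):
        div, mod = divmod(data_shape[d] - tile_overlap[d], tile_step[d])
        # "drop" ignores the incomplete tile at the edge, other modes keep it
        counts.append(div if mode == "drop" else div + (mod != 0))
    if channel_dimension is not None:
        counts[channel_dimension] = 1
    return tile_overlap, tile_step, counts


data, tile = (3, 1920, 1080), (3, 250, 250)
no_overlap = mosaic_shape(data, tile, channel_dimension=0)
dropped = mosaic_shape(data, tile, channel_dimension=0, mode="drop")
half_overlap = mosaic_shape(data, tile, overlap=0.5, channel_dimension=0)
```

For the Quick start shapes, this gives 1 x 8 x 5 = 40 tiles in the default mode and 1 x 7 x 4 = 28 tiles in drop mode; a 50% overlap halves the step to 125 pixels and yields 1 x 15 x 8 tiles.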
Supported tiling modes:
- constant (default): If a tile is smaller than tile_shape, pad it with the constant value along each axis to match tile_shape. Set the value with the keyword constant_value.
- drop: If a tile is smaller than tile_shape in any of the dimensions, ignore it. Can result in zero tiles.
- irregular: Allow tiles to be smaller than tile_shape.
- reflect: If a tile is smaller than tile_shape, pad it with the reflection of values along each axis to match tile_shape.
- edge: If a tile is smaller than tile_shape, pad it with the edge values of data along each axis to match tile_shape.
- wrap: If a tile is smaller than tile_shape, pad it with the wrap of the vector along each axis to match tile_shape. The first values are used to pad the end and the end values are used to pad the beginning.
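The four padding modes share their names with modes of numpy.pad, which get_tile() calls to pad only the end of each axis. The following sketch applies numpy.pad directly to a small 1D edge tile to show what each mode produces; the array values and the tile length of 5 are assumed for illustration.

```python
import numpy as np

# An edge tile that only covers 3 of the 5 elements a full tile would have
edge_tile = np.array([1, 2, 3])
missing = 5 - edge_tile.size

# Pad only after the data, as get_tile() does with (0, diff)
padded = {
    "constant": np.pad(edge_tile, (0, missing), mode="constant", constant_values=0),
    "reflect": np.pad(edge_tile, (0, missing), mode="reflect"),
    "edge": np.pad(edge_tile, (0, missing), mode="edge"),
    "wrap": np.pad(edge_tile, (0, missing), mode="wrap"),
}

for mode, tile in padded.items():
    print(f"{mode:>8}: {tile.tolist()}")
```

The drop and irregular modes do no padding: drop skips such a tile entirely, and irregular returns it with its shorter shape.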
    def recalculate(
        self,
        data_shape: Optional[Union[Tuple, List, np.ndarray]] = None,
        tile_shape: Optional[Union[Tuple, List, np.ndarray]] = None,
        overlap: Optional[Union[int, float, Tuple, List, np.ndarray]] = None,
        channel_dimension: Optional[int] = None,
        mode: Optional[str] = None,
        constant_value: Optional[float] = None,
    ) -> None:
        """Recalculates tiling for new given settings.
        If a passed value is None, use previously given value.

        For more information about each argument see `Tiler.__init__()` documentation.
        """

        # Data and tile shapes
        if data_shape is not None:
            self.data_shape = np.asarray(data_shape, dtype=np.int64)
        if tile_shape is not None:
            self.tile_shape = np.atleast_1d(np.asarray(tile_shape, dtype=np.int64))

            # Append ones to match data_shape size
            if self.tile_shape.size < self.data_shape.size:
                size_difference = self.data_shape.size - self.tile_shape.size
                self.tile_shape = np.insert(arr=self.tile_shape, obj=0, values=np.ones(size_difference), axis=0)
                warnings.warn(
                    f"Tiler automatically adjusted tile_shape from {tuple(tile_shape)} to {tuple(self.tile_shape)}.",
                    stacklevel=2,
                )
        self._n_dim: int = len(self.data_shape)
        if (self.tile_shape <= 0).any() or (self.data_shape <= 0).any():
            raise ValueError("Tile and data shapes must be tuple, list or ndarray of positive integers.")
        if self.tile_shape.size != self.data_shape.size:
            raise ValueError(
                "Tile shape must have less or equal number of elements compared to the data shape. "
                "If less, your tile shape will be prepended with ones to match the data shape, "
                "e.g. data_shape=(28, 28), tile_shape=(28) -> tile_shape=(1, 28)."
            )

        # Tiling mode
        if mode is not None:
            self.mode = mode
        if self.mode not in self.TILING_MODES:
            raise ValueError(f"{self.mode} is an unsupported tiling mode, please check the documentation.")

        # Constant value used for constant tiling mode
        if constant_value is not None:
            self.constant_value = constant_value

        # Channel dimension can be None which means we need to check for init too
        if not hasattr(self, "channel_dimension") or channel_dimension is not None:
            self.channel_dimension = channel_dimension
        if self.channel_dimension:
            if (self.channel_dimension >= self._n_dim) or (self.channel_dimension < -self._n_dim):
                raise ValueError(
                    f"Specified channel dimension is out of bounds "
                    f"(should be None or an integer from {-self._n_dim} to {self._n_dim - 1})."
                )
            if self.channel_dimension < 0:
                # negative indexing
                self.channel_dimension = self._n_dim + self.channel_dimension

        # Overlap and step
        if overlap is not None:
            self.overlap = overlap
        if isinstance(self.overlap, float):
            if self.overlap < 0 or self.overlap >= 1.0:
                raise ValueError("Float overlap must be in range of [0.0, 1.0) i.e. [0%, 100%).")

            self._tile_overlap: np.ndarray = np.ceil(self.overlap * self.tile_shape).astype(int)
            if self.channel_dimension is not None:
                self._tile_overlap[self.channel_dimension] = 0

        elif isinstance(self.overlap, int):
            tile_shape_without_channel = self.tile_shape[np.arange(self._n_dim) != self.channel_dimension]
            if self.overlap < 0 or np.any(self.overlap >= tile_shape_without_channel):
                raise ValueError(f"Integer overlap must be in range of 0 to {np.max(tile_shape_without_channel)}")

            self._tile_overlap: np.ndarray = np.array([self.overlap for _ in self.tile_shape])
            if self.channel_dimension is not None:
                self._tile_overlap[self.channel_dimension] = 0

        elif isinstance(self.overlap, (list, tuple, np.ndarray)):
            if np.any(np.array(self.overlap) < 0) or np.any(self.overlap >= self.tile_shape):
                raise ValueError("Overlap size must be smaller than tile_shape.")

            self._tile_overlap: np.ndarray = np.array(self.overlap).astype(int)

        else:
            raise ValueError("Unsupported overlap mode (not float, int, list, tuple or np.ndarray).")

        self._tile_step: np.ndarray = (self.tile_shape - self._tile_overlap).astype(int)  # tile step

        # Calculate mosaic (collection of tiles) shape
        div, mod = np.divmod(
            [self.data_shape[d] - self._tile_overlap[d] for d in range(self._n_dim)],
            self._tile_step,
        )
        if self.mode == "drop":
            self._indexing_shape = div
        else:
            self._indexing_shape = div + (mod != 0)
        if self.channel_dimension is not None:
            self._indexing_shape[self.channel_dimension] = 1

        # Calculate new shape assuming tiles are padded
        if self.mode == "irregular":
            self._new_shape = self.data_shape
        else:
            self._new_shape = (self._indexing_shape * self._tile_step) + self._tile_overlap
        self._shape_diff = self._new_shape - self.data_shape
        if self.channel_dimension is not None:
            self._shape_diff[self.channel_dimension] = 0

        # If channel dimension is given, set tile_step of that dimension to 0
        if self.channel_dimension is not None:
            self._tile_step[self.channel_dimension] = 0

        # Tile indexing
        self._tile_index = np.vstack(np.meshgrid(*[np.arange(0, x) for x in self._indexing_shape], indexing="ij"))
        self._tile_index = self._tile_index.reshape(self._n_dim, -1).T
        self.n_tiles = len(self._tile_index)

        if self.n_tiles == 0:
            warnings.warn(
                f"Tiler (mode={mode}, overlap={overlap}) will split data_shape {data_shape} "
                f"into zero tiles (tile_shape={tile_shape}).",
                stacklevel=2,
            )
Recalculates the tiling for the newly given settings. If a passed value is None, the previously given value is used.
For more information about each argument, see the Tiler.__init__() documentation.
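The mosaic-shape arithmetic in the source above can be re-derived in a few lines of plain Python. This is only a sketch under simplifying assumptions: a single integer overlap shared by all axes, no channel dimension, and `mosaic_shape` is a hypothetical name chosen for this illustration rather than a tiler function.

```python
def mosaic_shape(data_shape, tile_shape, overlap=0, mode="constant"):
    """Number of tiles along each axis (illustrative re-derivation, not tiler API)."""
    counts = []
    for data_len, tile_len in zip(data_shape, tile_shape):
        step = tile_len - overlap  # distance between neighbouring tile corners
        full, remainder = divmod(data_len - overlap, step)
        if mode == "drop":
            # Partial tiles at the border are discarded.
            counts.append(full)
        else:
            # A partial tile at the border still counts as a tile.
            counts.append(full + (1 if remainder else 0))
    return tuple(counts)


print(mosaic_shape((1920, 1080), (250, 250)))               # (8, 5)
print(mosaic_shape((1920, 1080), (250, 250), mode="drop"))  # (7, 4)
print(mosaic_shape((1920, 1080), (250, 250), overlap=50))   # (10, 6)
```

With the Quick start shapes (spatial part 1920 x 1080, tiles of 250 x 250, no overlap), this gives 8 x 5 = 40 tiles in the padded modes and 7 x 4 = 28 tiles in drop mode.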
    def iterate(
        self,
        data: Union[np.ndarray, Callable[..., np.ndarray]],
        progress_bar: bool = False,
        batch_size: int = 0,
        drop_last: bool = False,
        copy_data: bool = True,
    ) -> Generator[Tuple[int, np.ndarray], None, None]:
        """Iterates through tiles of the given data array. This method can also be accessed by `Tiler.__call__()`.

        Args:
            data (np.ndarray or callable): The data array on which the tiling will be performed. A callable can be
                supplied to load data into memory instead of slicing from an array. The callable should take integers
                as input, the smallest tile corner coordinates and tile size in each dimension, and output numpy array.

                e.g.
                *python-bioformats*
                ```python
                >>> tileSize = 2000
                >>> tiler = Tiler((sizeX, sizeY, sizeC), (tileSize, tileSize, sizeC))
                >>> def reader_func(*args):
                >>>     X, Y, W, H = args[0], args[1], args[3], args[4]
                >>>     return reader.read(XYWH=[X, Y, W, H])
                >>> for t_id, tile in tiler.iterate(reader_func):
                >>>     pass
                ```
                *open-slide*
                ```python
                >>> reader_func = lambda *args: wsi.read_region([args[0], args[1]], 0, [args[3], args[4]])
                >>> for t_id, tile in tiler.iterate(reader_func):
                >>>     pass
                ```

            progress_bar (bool): Specifies whether to show the progress bar or not.
                Uses `tqdm` package.
                Default is `False`.

            batch_size (int): Specifies returned batch size.
                If `batch_size == 0`, return one tile at a time.
                If `batch_size >= 1`, return in batches (returned shape: `[batch_size, *tile_shape]`).
                Default is 0.

            drop_last (bool): Specifies whether to drop last non-full batch.
                Used only when batch_size > 0.
                Default is False.

            copy_data (bool): Specifies whether to copy the tile before returning it.
                If `copy_data == False`, returns a view.
                Default is True.

        Yields:
            (int, np.ndarray): Tuple with integer tile number and array tile data.
        """

        if batch_size < 0:
            raise ValueError(f"Batch size must >= 0, not {batch_size}")

        # return a tile at a time
        if batch_size == 0:
            for tile_i in tqdm(range(self.n_tiles), disable=not progress_bar, unit=" tiles"):
                yield tile_i, self.get_tile(data, tile_i, copy_data=copy_data)

        # return in batches
        if batch_size > 0:
            # check for drop_last
            length = (self.n_tiles - (self.n_tiles % batch_size)) if drop_last else self.n_tiles

            for tile_i in tqdm(range(0, length, batch_size), disable=not progress_bar, unit=" batches"):
                tiles = np.stack(
                    [
                        self.get_tile(data, x, copy_data=copy_data)
                        for x in range(tile_i, min(tile_i + batch_size, length))
                    ]
                )
                yield tile_i // batch_size, tiles
Iterates through tiles of the given data array. This method can also be accessed through Tiler.__call__().
Arguments:
- data (np.ndarray or callable): The data array on which the tiling will be performed. A callable can be supplied to load data into memory instead of slicing from an array. The callable should take integers as input (the smallest tile corner coordinates for every dimension, followed by the tile size in every dimension) and return a NumPy array.
e.g. python-bioformats
>>> tileSize = 2000
>>> tiler = Tiler((sizeX, sizeY, sizeC), (tileSize, tileSize, sizeC))
>>> def reader_func(*args):
>>>     X, Y, W, H = args[0], args[1], args[3], args[4]
>>>     return reader.read(XYWH=[X, Y, W, H])
>>> for t_id, tile in tiler.iterate(reader_func):
>>>     pass
open-slide
>>> reader_func = lambda *args: wsi.read_region([args[0], args[1]], 0, [args[3], args[4]])
>>> for t_id, tile in tiler.iterate(reader_func):
>>>     pass
- progress_bar (bool): Specifies whether to show a progress bar. Uses the tqdm package. Default is False.
- batch_size (int): Specifies the returned batch size. If batch_size == 0, return one tile at a time. If batch_size >= 1, return tiles in batches (returned shape: [batch_size, *tile_shape]). Default is 0.
- drop_last (bool): Specifies whether to drop the last non-full batch. Used only when batch_size > 0. Default is False.
- copy_data (bool): Specifies whether to copy the tile before returning it. If copy_data == False, returns a view. Default is True.
Yields:
(int, np.ndarray): Tuple with the integer tile number and the tile data array. When batch_size > 0, the integer is the batch number and the array holds the stacked tiles of that batch.
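How batch_size and drop_last interact is easiest to see from the tile ids that end up in each batch. The following plain-Python sketch mirrors the batching arithmetic of the `iterate()` source above; `batch_plan` is a hypothetical helper for illustration and not part of tiler.

```python
def batch_plan(n_tiles, batch_size, drop_last=False):
    """Yield (batch_id, tile_ids) pairs (illustrative helper, not tiler API)."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive in this sketch")
    # With drop_last, the trailing non-full batch is cut off.
    length = n_tiles - (n_tiles % batch_size) if drop_last else n_tiles
    for start in range(0, length, batch_size):
        stop = min(start + batch_size, length)
        yield start // batch_size, list(range(start, stop))


print(list(batch_plan(10, 4)))
# [(0, [0, 1, 2, 3]), (1, [4, 5, 6, 7]), (2, [8, 9])]
print(list(batch_plan(10, 4, drop_last=True)))
# [(0, [0, 1, 2, 3]), (1, [4, 5, 6, 7])]
```

Note that without drop_last the final batch can be smaller than batch_size, so downstream code should not assume a fixed leading dimension.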
    def get_tile(
        self,
        data: Union[np.ndarray, Callable[..., np.ndarray]],
        tile_id: int,
        copy_data: bool = True,
    ) -> np.ndarray:
        """Returns an individual tile.

        Args:
            data (np.ndarray or callable): Data from which `tile_id`-th tile will be taken. A callable can be
                supplied to load data into memory instead of slicing from an array. The callable should take integers
                as input, the smallest tile corner coordinates and tile size in each dimension, and output numpy array.

                e.g.
                *python-bioformats*
                ```python
                >>> tileSize = 2000
                >>> tiler = Tiler((sizeX, sizeY, sizeC), (tileSize, tileSize, sizeC))
                >>> def reader_func(*args):
                >>>     X, Y, W, H = args[0], args[1], args[3], args[4]
                >>>     return reader.read(XYWH=[X, Y, W, H])
                >>> tiler.get_tile(reader_func, 0)
                ```
                *open-slide*
                ```python
                >>> reader_func = lambda *args: wsi.read_region([args[0], args[1]], 0, [args[3], args[4]])
                >>> tiler.get_tile(reader_func, 0)
                ```

            tile_id (int): Specifies which tile to return. Must be smaller than the total number of tiles.

            copy_data (bool): Specifies whether returned tile is a copy.
                If `copy_data == False` returns a view.
                Default is True.

        Returns:
            np.ndarray: Content of tile number `tile_id`, padded if necessary.
        """

        if (tile_id < 0) or (tile_id >= self.n_tiles):
            raise IndexError(
                f"Out of bounds, there is no tile {tile_id}.There are {len(self) - 1} tiles, starting from index 0."
            )

        if isinstance(data, np.ndarray) and np.not_equal(data.shape, self.data_shape).any():
            raise ValueError(
                f"Shape of provided data array ({data.shape}) does not match "
                f"same as Tiler's data_shape ({tuple(self.data_shape)})."
            )

        # get tile data
        tile_corner = self._tile_index[tile_id] * self._tile_step
        # take the lesser of the tile shape and the distance to the edge
        sampling = [
            slice(
                tile_corner[d],
                np.min([self.data_shape[d], tile_corner[d] + self.tile_shape[d]]),
            )
            for d in range(self._n_dim)
        ]

        if callable(data):
            sampling = [x.stop - x.start for x in sampling]
            tile_data = data(*tile_corner, *sampling)
        else:
            tile_data = data[tuple(sampling)]

        if copy_data:
            tile_data = tile_data.copy()

        shape_diff = self.tile_shape - tile_data.shape
        if (self.mode != "irregular") and np.any(shape_diff > 0):
            if self.mode == "constant":
                tile_data = np.pad(
                    tile_data,
                    list((0, diff) for diff in shape_diff),
                    mode=self.mode,
                    constant_values=self.constant_value,
                )
            elif self.mode == "reflect" or self.mode == "edge" or self.mode == "wrap":
                tile_data = np.pad(tile_data, list((0, diff) for diff in shape_diff), mode=self.mode)

        return tile_data
Returns an individual tile.
Arguments:
- data (np.ndarray or callable): Data from which the tile_id-th tile will be taken. A callable can be supplied to load data into memory instead of slicing from an array. The callable should take integers as input (the smallest tile corner coordinates for every dimension, followed by the tile size in every dimension) and return a NumPy array.
e.g. python-bioformats
>>> tileSize = 2000
>>> tiler = Tiler((sizeX, sizeY, sizeC), (tileSize, tileSize, sizeC))
>>> def reader_func(*args):
>>>     X, Y, W, H = args[0], args[1], args[3], args[4]
>>>     return reader.read(XYWH=[X, Y, W, H])
>>> tiler.get_tile(reader_func, 0)
open-slide
>>> reader_func = lambda *args: wsi.read_region([args[0], args[1]], 0, [args[3], args[4]])
>>> tiler.get_tile(reader_func, 0)
- tile_id (int): Specifies which tile to return. Must be smaller than the total number of tiles.
- copy_data (bool): Specifies whether the returned tile is a copy. If copy_data == False, returns a view. Default is True.
Returns:
np.ndarray: Content of tile number tile_id, padded if necessary.
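The calling convention for a callable data source (all corner coordinates first, then all sizes) can be sketched without any imaging library. The example below uses nested Python lists in place of a NumPy array purely for illustration; `make_reader` and `tile_window` are hypothetical helpers, and the window arithmetic follows the `get_tile()` source above, where the size is clipped at the data border.

```python
def make_reader(rows):
    """Wrap a 2-D list of lists in a callable taking (row0, col0, height, width)."""
    def reader(row0, col0, height, width):
        return [row[col0:col0 + width] for row in rows[row0:row0 + height]]
    return reader


def tile_window(mosaic_position, tile_step, tile_shape, data_shape):
    """Corner and clipped size of a tile (illustrative helper, not tiler API)."""
    corner = [p * s for p, s in zip(mosaic_position, tile_step)]
    # Take the lesser of the tile size and the distance to the data edge.
    size = [min(d, c + t) - c for d, c, t in zip(data_shape, corner, tile_shape)]
    return corner, size


data = [[10 * r + c for c in range(5)] for r in range(4)]  # 4 rows x 5 columns
reader = make_reader(data)

corner, size = tile_window((1, 1), (3, 3), (3, 3), (4, 5))
print(corner, size)            # [3, 3] [1, 2]
print(reader(*corner, *size))  # [[33, 34]]
```

The border tile comes back smaller than the 3 x 3 tile shape; in every mode except irregular, the library then pads it up to tile_shape.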
    def get_all_tiles(
        self,
        data: Union[np.ndarray, Callable[..., np.ndarray]],
        axis: int = 0,
        copy_data: bool = True,
    ) -> np.ndarray:
        """Returns all tiles joined along a new axis. Does not work for `Tiler.mode = 'irregular'`.

        The `axis` parameter specifies the index of the new axis in the dimensions of the result.
        For example, if `axis=0` it will be the first dimension and if `axis=-1` it will be the last dimension.

        For more information about `data` and `copy_data` parameters, see `Tiler.get_tile()`.

        Args:
            data (np.ndarray or callable): Data which will be tiled. A callable can be supplied to load data into memory
                instead of slicing from an array. The callable should take integers as input, the smallest tile corner
                coordinates and tile size in each dimension, and output numpy array.

            axis (int): The axis in the result array along which the tiles are stacked.

            copy_data (bool): Specifies whether returned tile is a copy.
                If `copy_data == False` returns a view.
                Default is True.

        Returns:
            np.ndarray: All tiles stacked along a new axis.
        """

        if self.mode == "irregular":
            raise ValueError("get_all_tiles does not support irregular mode")

        return np.stack(
            [self.get_tile(data, x, copy_data=copy_data) for x in range(self.n_tiles)],
            axis=axis,
        )
Returns all tiles joined along a new axis. Does not work for Tiler.mode = 'irregular'.
The axis parameter specifies the index of the new axis in the dimensions of the result. For example, if axis=0 it will be the first dimension and if axis=-1 it will be the last dimension.
For more information about the data and copy_data parameters, see Tiler.get_tile().
Arguments:
- data (np.ndarray or callable): Data which will be tiled. A callable can be supplied to load data into memory instead of slicing from an array. The callable should take integers as input (the smallest tile corner coordinates, followed by the tile size in each dimension) and return a NumPy array.
- axis (int): The axis in the result array along which the tiles are stacked.
- copy_data (bool): Specifies whether each returned tile is a copy. If copy_data == False, returns a view. Default is True.
Returns:
np.ndarray: All tiles stacked along a new axis.
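The effect of the axis argument on the result shape can be worked out by hand. The sketch below reproduces the shape rule of stacking equally shaped tiles along a new axis; `stacked_shape` is a hypothetical helper for illustration, and the tile count of 40 assumes the Quick start configuration.

```python
def stacked_shape(n_tiles, tile_shape, axis=0):
    """Shape of n_tiles equally shaped tiles stacked along a new axis (illustrative)."""
    result_ndim = len(tile_shape) + 1
    if not -result_ndim <= axis < result_ndim:
        raise ValueError(f"axis {axis} is out of bounds for {result_ndim} dimensions")
    if axis < 0:
        # Negative axes count from the end of the result shape.
        axis += result_ndim
    shape = list(tile_shape)
    shape.insert(axis, n_tiles)
    return tuple(shape)


print(stacked_shape(40, (3, 250, 250), axis=0))   # (40, 3, 250, 250)
print(stacked_shape(40, (3, 250, 250), axis=-1))  # (3, 250, 250, 40)
```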
    def get_tile_bbox(
        self,
        tile_id: int,
        with_channel_dim: bool = False,
        all_corners: bool = False,
    ) -> Union[Tuple[np.ndarray, np.ndarray], np.ndarray]:
        """Returns coordinates of the opposite corners of the bounding box (hyperrectangle?) of the tile on padded data.

        Args:
            tile_id (int): Specifies which tile's bounding coordinates will be returned.
                Must be between 0 and the total number of tiles.

            with_channel_dim (bool): Specifies whether to return shape with channel dimension or without.
                Default is False.

            all_corners (bool): If True, returns all vertices of the bounding box.
                Default is False.

        Returns:
            (np.ndarray, np.ndarray): Smallest (bottom-left) and largest (top-right) corners of the bounding box.

            np.ndarray: All corners of the bounding box, if `all_corners=True`.
        """

        if (tile_id < 0) or (tile_id >= self.n_tiles):
            raise IndexError(
                f"Out of bounds, there is no tile {tile_id}. There are {len(self) - 1} tiles, starting from index 0."
            )

        # find min and max vertices
        bottom_left_corner = self._tile_step * self.get_tile_mosaic_position(tile_id, True)
        top_right_corner = bottom_left_corner + self.tile_shape

        # remove channel dimension if not required
        if self.channel_dimension is not None and not with_channel_dim:
            dim_indices = list(range(self.channel_dimension)) + list(
                range(self.channel_dimension + 1, len(self._tile_step))
            )
            bottom_left_corner = bottom_left_corner[dim_indices]
            top_right_corner = top_right_corner[dim_indices]

        # by default, return only min/max vertices
        if not all_corners:
            return bottom_left_corner, top_right_corner

        # otherwise, return all vertices of the bbox
        # inspired by https://stackoverflow.com/a/57065356/1668421
        # but instead create an indexing array from cartesian product of bits
        # and use it to sample intervals
        else:
            n_dim: int = len(bottom_left_corner)  # already channel_dimension adjusted
            mins = np.minimum(bottom_left_corner, top_right_corner)
            maxs = np.maximum(bottom_left_corner, top_right_corner)
            intervals = np.stack([mins, maxs], -1)
            indexing = np.array(list(itertools.product([0, 1], repeat=n_dim)))
            corners = np.stack([intervals[x][indexing.T[x]] for x in range(n_dim)], -1)
            return corners
Returns the coordinates of the opposite corners of the tile's bounding box (a hyperrectangle in N dimensions) on the padded data.
Arguments:
- tile_id (int): Specifies which tile's bounding coordinates will be returned. Must be at least 0 and smaller than the total number of tiles.
- with_channel_dim (bool): Specifies whether the returned coordinates include the channel dimension. Default is False.
- all_corners (bool): If True, returns all vertices of the bounding box. Default is False.
Returns:
(np.ndarray, np.ndarray): Smallest (bottom-left) and largest (top-right) corners of the bounding box.
np.ndarray: All corners of the bounding box, if all_corners=True.
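The all_corners=True case enumerates every vertex by choosing, per axis, either the minimum or the maximum coordinate. A plain-Python sketch of that Cartesian-product idea follows; `bbox_corners` is a hypothetical helper, and the real method returns a NumPy array rather than a list of tuples.

```python
import itertools


def bbox_corners(lower, upper):
    """All 2**N vertices of an axis-aligned box (illustrative helper, not tiler API)."""
    intervals = list(zip(lower, upper))  # one (min, max) pair per axis
    return [
        tuple(pair[bit] for pair, bit in zip(intervals, bits))
        for bits in itertools.product([0, 1], repeat=len(intervals))
    ]


# A 250 x 250 tile whose smallest corner sits at (0, 250).
print(bbox_corners((0, 250), (250, 500)))
# [(0, 250), (0, 500), (250, 250), (250, 500)]
```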
    def get_tile_mosaic_position(self, tile_id: int, with_channel_dim: bool = False) -> np.ndarray:
        """Returns tile position in the mosaic.

        Args:
          tile_id (int): Specifies which tile's mosaic position will be returned. \
            Must be smaller than the total number of tiles.

          with_channel_dim (bool): Specifies whether to return position with channel dimension or without.
            Default is False.

        Returns:
            np.ndarray: Tile mosaic position (tile position relative to other tiles).
        """
        if (tile_id < 0) or (tile_id >= self.n_tiles):
            raise IndexError(
                f"Out of bounds, there is no tile {tile_id}. There are {len(self) - 1} tiles, starting from index 0."
            )

        if self.channel_dimension is not None and not with_channel_dim:
            return self._tile_index[tile_id][~(np.arange(self._n_dim) == self.channel_dimension)]
        return self._tile_index[tile_id]
Returns tile position in the mosaic.
Arguments:
- tile_id (int): Specifies which tile's mosaic position will be returned. Must be smaller than the total number of tiles.
- with_channel_dim (bool): Specifies whether to return position with channel dimension or without. Default is False.
Returns:
np.ndarray: Tile mosaic position (tile position relative to other tiles).
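Judging from the way the tile index is built in the `recalculate()` source (a meshgrid with indexing="ij", flattened in order), tile ids appear to map to mosaic positions in row-major order, with the last axis varying fastest. Under that assumption, the mapping can be sketched in plain Python; `mosaic_position` is a hypothetical helper and not the library's implementation.

```python
def mosaic_position(tile_id, mosaic_shape):
    """Row-major position of tile_id in a mosaic (illustrative, assumes last axis varies fastest)."""
    position = []
    for axis_len in reversed(mosaic_shape):
        tile_id, index = divmod(tile_id, axis_len)
        position.append(index)
    return tuple(reversed(position))


# An 8 x 5 mosaic, as for a 1920 x 1080 image with 250 x 250 tiles.
print(mosaic_position(0, (8, 5)))   # (0, 0)
print(mosaic_position(7, (8, 5)))   # (1, 2)
print(mosaic_position(39, (8, 5)))  # (7, 4)
```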
    def get_mosaic_shape(self, with_channel_dim: bool = False) -> np.ndarray:
        """Returns mosaic shape.

        Args:
            with_channel_dim (bool):
                Specifies whether to return shape with channel dimension or without. Defaults to False.

        Returns:
            np.ndarray: Shape of tiles mosaic.
        """
        if self.channel_dimension is not None and not with_channel_dim:
            return self._indexing_shape[~(np.arange(self._n_dim) == self.channel_dimension)]
        return self._indexing_shape
Returns mosaic shape.
Arguments:
- with_channel_dim (bool): Specifies whether to return shape with channel dimension or without. Defaults to False.
Returns:
np.ndarray: Shape of tiles mosaic.
    def calculate_padding(self) -> Tuple[np.ndarray, List[Tuple[int, int]]]:
        """Calculate a frame padding for the current Tiler parameters.
        The padding is overlap//2 or tile_step//2, whichever is bigger.
        The method returns a tuple (new_shape, padding) where padding is
        ((before_1, after_1), … (before_N, after_N)), unique pad widths for each axis N.

        In the usual workflow, you'd recalculate tiling settings and then apply padding, prior to tiling.
        Then when merging, pass padding to `Merger.merge(extra_padding=padding, ...)`:
        ```python
        >>> tiler = Tiler(...)
        >>> merger = Merger(tiler, ...)
        >>> new_shape, padding = tiler.calculate_padding()
        >>> tiler.recalculate(data_shape=new_shape)
        >>> padded_data = np.pad(data, pad_width=padding, mode="reflect")
        >>> for tile_id, tile in tiler(padded_data):
        >>>     processed_tile = process(tile)
        >>>     merger.add(tile_id, processed_tile)
        >>> final_image = merger.merge(extra_padding=padding)
        ```
        Return:
            np.ndarray: Resulting shape when padding is applied.

            List[Tuple[int, int]]: Calculated padding.
        """

        # Choosing padding
        pre_pad = np.maximum(self._tile_step // 2, self._tile_overlap // 2)
        post_pad = pre_pad + np.mod(self._tile_step, 2)

        new_shape = pre_pad + self.data_shape + post_pad
        padding = list(zip(pre_pad, post_pad, strict=False))

        return new_shape, padding
Calculates a frame padding for the current Tiler parameters. The padding before each axis is overlap // 2 or tile_step // 2, whichever is bigger; the padding after each axis is the same value plus tile_step % 2. The method returns a tuple (new_shape, padding), where padding is ((before_1, after_1), … (before_N, after_N)), the pad widths for each of the N axes.
In the usual workflow, you would recalculate the tiling settings and then apply the padding, prior to tiling.
Then, when merging, pass the padding to Merger.merge(extra_padding=padding, ...):
>>> tiler = Tiler(...)
>>> merger = Merger(tiler, ...)
>>> new_shape, padding = tiler.calculate_padding()
>>> tiler.recalculate(data_shape=new_shape)
>>> padded_data = np.pad(data, pad_width=padding, mode="reflect")
>>> for tile_id, tile in tiler(padded_data):
>>>     processed_tile = process(tile)
>>>     merger.add(tile_id, processed_tile)
>>> final_image = merger.merge(extra_padding=padding)
Returns:
np.ndarray: Resulting shape when padding is applied.
List[Tuple[int, int]]: Calculated padding.
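The padding rule can be checked with small numbers. The sketch below repeats the arithmetic of the `calculate_padding()` source in plain Python, assuming per-axis integer overlaps and no channel dimension; `frame_padding` is a hypothetical name used only for this illustration.

```python
def frame_padding(data_shape, tile_shape, overlap):
    """Per-axis (before, after) padding and the padded shape (illustrative, not tiler API)."""
    new_shape, padding = [], []
    for data_len, tile_len, axis_overlap in zip(data_shape, tile_shape, overlap):
        step = tile_len - axis_overlap
        before = max(step // 2, axis_overlap // 2)
        after = before + step % 2  # an odd step adds one extra element at the end
        padding.append((before, after))
        new_shape.append(before + data_len + after)
    return tuple(new_shape), padding


print(frame_padding((1920, 1080), (250, 250), (50, 50)))
# ((2120, 1280), [(100, 100), (100, 100)])
print(frame_padding((10,), (5,), (2,)))
# ((13,), [(1, 2)])
```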
class Merger:
    SUPPORTED_WINDOWS = [
        "boxcar",
        "triang",
        "blackman",
        "hamming",
        "hann",
        "bartlett",
        "parzen",
        "bohman",
        "blackmanharris",
        "nuttall",
        "barthann",
        "overlap-tile",
    ]
    r"""
    Supported windows:
    - 'boxcar' (default)
        Boxcar window: the weight of each tile element is 1.
        Also known as rectangular window or Dirichlet window (and equivalent to no window at all).
    - 'triang'
        Triangular window.
    - 'blackman'
        Blackman window.
    - 'hamming'
        Hamming window.
    - 'hann'
        Hann window.
    - 'bartlett'
        Bartlett window.
    - 'parzen'
        Parzen window.
    - 'bohman'
        Bohman window.
    - 'blackmanharris'
        Minimum 4-term Blackman-Harris window.
    - 'nuttall'
        Minimum 4-term Blackman-Harris window according to Nuttall.
    - 'barthann'
        Bartlett-Hann window.
    - 'overlap-tile'
        Creates a boxcar window for the non-overlapping, middle part of tile, and zeros everywhere else.
        Requires applying padding calculated with `Tiler.calculate_padding()` for correct results.
        (based on Ronneberger et al. 2015, U-Net paper)
    """

    def __init__(
        self,
        tiler: Tiler,
        window: Union[None, str, np.ndarray] = None,
        logits: int = 0,
        save_visits: bool = True,
        data_dtype: npt.DTypeLike = np.float32,
        weights_dtype: npt.DTypeLike = np.float32,
    ):
        """Merger holds cumulative result buffers for merging tiles created by a given Tiler
        and the window function that is applied to added tiles.

        There are two required buffers, `self.data` and `self.weights_sum` (np.float32 by default,
        see `data_dtype` and `weights_dtype`), and one optional np.uint32 `self.data_visits`
        (see below `save_visits` argument).

        Args:
            tiler (Tiler): Tiler with which the tiles were originally created.

            window (None, str or np.ndarray): Specifies which window function to use for tile merging.
                Must be one of `Merger.SUPPORTED_WINDOWS` or a numpy array with the same size as the tile.
                Default is None which creates a boxcar window (constant 1s).

            logits (int): Specify whether to add logits dimensions in front of the data array. Default is `0`.

            save_visits (bool): Specify whether to save which elements have been modified and how many times in
                `self.data_visits`. Can be disabled to save some memory. Default is `True`.

            data_dtype (np.dtype): Specify data type for data buffer that stores cumulative result.
                Default is `np.float32`.

            weights_dtype (np.dtype): Specify data type for weights buffer that stores cumulative weights and window.
                If you don't need precision but would rather save memory you can use `np.float16`.
                Conversely, if you need more precision, you can use `np.float64`.
                Default is `np.float32`.

        """

        self.tiler = tiler
        """@private"""

        # Logits support
        if not isinstance(logits, int) or logits < 0:
            raise ValueError(f"Logits must be an integer 0 or a positive number ({logits}).")
        self.logits = int(logits)
        """@private"""

        # Generate data and normalization arrays
        self.data = self.data_visits = self.weights_sum = None
        self.data_dtype = data_dtype
        """@private"""
        self.weights_dtype = weights_dtype
        """@private"""
        self.reset(save_visits)

        # Generate window function
        self.window = None
        """@private"""
        self.set_window(window)

    def _generate_window(self, window: str, shape: Union[Tuple, List]) -> np.ndarray:
        """Generate n-dimensional window according to the given shape.
        Adapted from: https://stackoverflow.com/a/53588640/1668421

        Args:
            window (str): Specifies window function. Must be one of `Merger.SUPPORTED_WINDOWS`.

            shape (tuple or list): Shape of the requested window.

        Returns:
            np.ndarray: n-dimensional window of the given shape and function
        """

        w = np.ones(shape, dtype=self.weights_dtype)
        overlap = self.tiler._tile_overlap
        for axis, length in enumerate(shape):
            if axis == self.tiler.channel_dimension:
                # channel dimension should have weight of 1 everywhere
                win = get_window("boxcar", length)
            else:
                if window == "overlap-tile":
                    axis_overlap = overlap[axis] // 2
                    win = np.zeros(length)
                    win[axis_overlap:-axis_overlap] = 1
                else:
                    win = get_window(window, length)

            for i in range(len(shape)):
                if i == axis:
                    continue
                else:
                    win = np.stack([win] * shape[i], axis=i)

            w *= win.astype(self.weights_dtype)

        return w

    def set_window(self, window: Union[None, str, np.ndarray] = None) -> None:
        """Sets window function depending on the given window function.

        Args:
            window (None, str or np.ndarray): Specifies which window function to use for tile merging.
                Must be one of `Merger.SUPPORTED_WINDOWS` or a numpy array with the same size as the tile.
                If passed None sets a boxcar window (constant 1s).

        Returns:
            None
        """

        # Warn user that changing window type after some elements were already visited is a bad idea.
        if np.count_nonzero(self.data_visits):
            warnings.warn("You are setting window type after some elements were already added.", stacklevel=2)

        # Default window is boxcar
        if window is None:
            window = "boxcar"

        # Generate or set a window function
        if isinstance(window, str):
            if window not in self.SUPPORTED_WINDOWS:
                raise ValueError("Unsupported window, please check docs")
            self.window = self._generate_window(window, self.tiler.tile_shape)
        elif isinstance(window, np.ndarray):
            if not np.array_equal(window.shape, self.tiler.tile_shape):
                raise ValueError("Window function must have the same shape as tile shape.")
            self.window = window.astype(self.weights_dtype)
        else:
            raise ValueError(f"Unsupported type for window function ({type(window)}), expected str or np.ndarray.")

    def reset(self, save_visits: bool = True) -> None:
        """Reset data, weights and optional data_visits buffers.

        Should be done after finishing merging full tile set and before starting processing the next tile set.

        Args:
            save_visits (bool): Specify whether to save which elements have been modified and how many times in
                `self.data_visits`. Can be disabled to save some memory. Default is `True`.

        Returns:
            None
        """

        padded_data_shape = self.tiler._new_shape

        # Image holds sum of all processed tiles multiplied by the window
        if self.logits:
            self.data = np.zeros((self.logits, *padded_data_shape), dtype=self.data_dtype)
        else:
            self.data = np.zeros(padded_data_shape, dtype=self.data_dtype)

        # Data visits holds the number of times each element was assigned
        if save_visits:
            self.data_visits = np.zeros(padded_data_shape, dtype=np.uint32)  # uint32 ought to be enough for anyone :)

        # Total data window (weight) coefficients
        self.weights_sum = np.zeros(padded_data_shape, dtype=self.weights_dtype)

    def add(self, tile_id: int, data: np.ndarray) -> None:
        """Adds `tile_id`-th tile into Merger.

        Args:
            tile_id (int): Specifies which tile it is.

            data (np.ndarray): Specifies tile data.

        Returns:
            None
        """
        if tile_id < 0 or tile_id >= len(self.tiler):
            raise IndexError(
                f"Out of bounds, there is no tile {tile_id}. There are {len(self.tiler)} tiles, starting from index 0."
            )

        data_shape = np.array(data.shape)
        expected_tile_shape = (
            ((self.logits,) + tuple(self.tiler.tile_shape)) if self.logits > 0 else tuple(self.tiler.tile_shape)
        )

        if self.tiler.mode != "irregular":
            if not np.all(np.equal(data_shape, expected_tile_shape)):
                raise ValueError(
                    f"Passed data shape ({data_shape}) does not fit expected tile shape ({expected_tile_shape})."
                )
        else:
            if not np.all(np.less_equal(data_shape, expected_tile_shape)):
                raise ValueError(
                    f"Passed data shape ({data_shape}) must be less or equal than tile shape ({expected_tile_shape})."
                )

        # Select coordinates for data
        shape_diff = expected_tile_shape - data_shape
        a, b = self.tiler.get_tile_bbox(tile_id, with_channel_dim=True)

        sl = [slice(x, y - shape_diff[i]) for i, (x, y) in enumerate(zip(a, b, strict=False))]
        win_sl = [slice(None, -diff) if (diff > 0) else slice(None, None) for diff in shape_diff]

        if self.logits > 0:
            self.data[tuple([slice(None, None, None)] + sl)] += data * self.window[tuple(win_sl[1:])]
            self.weights_sum[tuple(sl)] += self.window[tuple(win_sl[1:])]
        else:
            self.data[tuple(sl)] += data * self.window[tuple(win_sl)]
            self.weights_sum[tuple(sl)] += self.window[tuple(win_sl)]

        if self.data_visits is not None:
            self.data_visits[tuple(sl)] += 1

    def add_batch(self, batch_id: int, batch_size: int, data: np.ndarray) -> None:
        """Adds `batch_id`-th batch of `batch_size` tiles into Merger.

        Args:
            batch_id (int): Specifies batch number, must be >= 0.

            batch_size (int): Specifies batch size, must be >= 0.

            data (np.ndarray): Tile data array, must have shape `[batch, *tile_shape]`

        Returns:
            None
        """

        # calculate total number of batches
        div, mod = np.divmod(len(self.tiler), batch_size)
        n_batches = (div + 1) if mod > 0 else div

        if batch_id < 0 or batch_id >= n_batches:
            raise IndexError(f"Out of bounds. There are {n_batches} batches of {batch_size}, starting from index 0.")

        # add each tile in a batch with computed tile_id
        for data_i, tile_i in enumerate(
            range(batch_id * batch_size, min((batch_id + 1) * batch_size, len(self.tiler)))
        ):
            self.add(tile_i, data[data_i])

    def _unpad(self, data: np.ndarray, extra_padding: Optional[List[Tuple[int, int]]] = None):
        """Slices/unpads data according to merger and tiler settings, as well as additional padding.

        Args:
            data (np.ndarray): Data to be sliced.

            extra_padding (tuple of tuples of two ints, optional): Specifies padding that was applied to the data.
                Number of values padded to the edges of each axis.
                ((before_1, after_1), … (before_N, after_N)) unique pad widths for each axis.
                Default is None.
        """
        if extra_padding:
            sl = [
                slice(pad_from, shape - pad_to)
                for shape, (pad_from, pad_to) in zip(self.tiler.data_shape, extra_padding, strict=False)
            ]
        else:
            sl = [slice(None, self.tiler.data_shape[i]) for i in range(len(self.tiler.data_shape))]

        # if merger has logits dimension, add another slicing in front
        if self.logits:
            sl = [slice(None, None, None)] + sl

        return data[tuple(sl)]

    def merge(
        self,
        unpad: bool = True,
        extra_padding: Optional[List[Tuple[int, int]]] = None,
        argmax: bool = False,
        normalize_by_weights: bool = True,
        dtype: Optional[npt.DTypeLike] = None,
    ) -> np.ndarray:
        """Returns merged data array obtained from added tiles.

        Args:
            unpad (bool): If unpad is True, removes padded array elements. Default is True.

            extra_padding (tuple of tuples of two ints, optional): Specifies padding that was applied to the data.
                Number of values padded to the edges of each axis.
                ((before_1, after_1), … (before_N, after_N)) unique pad widths for each axis.
                Default is None.

            argmax (bool): If argmax is True, the first dimension will be argmaxed.
                Useful when merger is initialized with `logits=True`.
                Default is False.

            normalize_by_weights (bool): If normalize is True, the accumulated data will be divided by weights.
                Default is True.

            dtype (np.dtype, optional): Specify dtype for the final merged output.
                If None, uses `data_dtype` specified when Merger was initialized.
                Default is None.

        Returns:
            np.ndarray: Final merged data array obtained from added tiles.
        """

        data = self.data

        if normalize_by_weights:
            # ignoring division by zero
            # alternatively, set values < atol to 1
            # https://github.com/the-lay/tiler/blob/46e948bb2bd7a909e954baf87a0c15b384109fde/tiler/merger.py#L314
            # TODO check which way is better
            # ignoring should be more precise without atol
            # but can hide other errors
            with np.errstate(divide="ignore", invalid="ignore"):
                data = np.nan_to_num(data / self.weights_sum)

        if unpad:
            data = self._unpad(data, extra_padding)

        if argmax:
            data = np.argmax(data, 0)

        if dtype is not None:
            return data.astype(dtype)
        else:
            return data.astype(self.data_dtype)
    def __init__(
        self,
        tiler: Tiler,
        window: Union[None, str, np.ndarray] = None,
        logits: int = 0,
        save_visits: bool = True,
        data_dtype: npt.DTypeLike = np.float32,
        weights_dtype: npt.DTypeLike = np.float32,
    ):
        """Merger holds cumulative result buffers for merging tiles created by a given Tiler
        and the window function that is applied to added tiles.

        There are two required np.float64 buffers: `self.data` and `self.weights_sum`
        and one optional np.uint32 `self.data_visits` (see below `save_visits` argument).

        Args:
            tiler (Tiler): Tiler with which the tiles were originally created.

            window (None, str or np.ndarray): Specifies which window function to use for tile merging.
                Must be one of `Merger.SUPPORTED_WINDOWS` or a numpy array with the same size as the tile.
                Default is None which creates a boxcar window (constant 1s).

            logits (int): Specify whether to add logits dimensions in front of the data array. Default is `0`.

            save_visits (bool): Specify whether to save which elements has been modified and how many times in
                `self.data_visits`. Can be disabled to save some memory. Default is `True`.

            data_dtype (np.dtype): Specify data type for data buffer that stores cumulative result.
                Default is `np.float32`.

            weights_dtype (np.dtype): Specify data type for weights buffer that stores cumulative weights and window.
                If you don't need precision but would rather save memory you can use `np.float16`.
                Likewise, on the opposite, you can use `np.float64`.
                Default is `np.float32`.

        """

        self.tiler = tiler
        """@private"""

        # Logits support
        if not isinstance(logits, int) or logits < 0:
            raise ValueError(f"Logits must be an integer 0 or a positive number ({logits}).")
        self.logits = int(logits)
        """@private"""

        # Generate data and normalization arrays
        self.data = self.data_visits = self.weights_sum = None
        self.data_dtype = data_dtype
        """@private"""
        self.weights_dtype = weights_dtype
        """@private"""
        self.reset(save_visits)

        # Generate window function
        self.window = None
        """@private"""
        self.set_window(window)
Merger holds cumulative result buffers for merging tiles created by a given Tiler, together with the window function that is applied to added tiles.
There are two required buffers, self.data and self.weights_sum, whose dtypes are set by data_dtype and weights_dtype (both np.float32 by default), and one optional np.uint32 buffer, self.data_visits (see the save_visits argument below).
Arguments:
- tiler (Tiler): Tiler with which the tiles were originally created.
- window (None, str or np.ndarray): Specifies which window function to use for tile merging. Must be one of Merger.SUPPORTED_WINDOWS or a NumPy array with the same shape as the tile. Default is None, which creates a boxcar window (constant 1s).
- logits (int): Size of the logits dimension added in front of the data array. Must be a non-negative integer; 0 adds no logits dimension. Default is 0.
- save_visits (bool): Specifies whether to record in self.data_visits which elements have been modified and how many times. Can be disabled to save some memory. Default is True.
- data_dtype (np.dtype): Specifies the data type of the data buffer that stores the cumulative result. Default is np.float32.
- weights_dtype (np.dtype): Specifies the data type of the weights buffer that stores the cumulative weights, and of the window. If you would rather save memory than keep precision, you can use np.float16. Conversely, for more precision you can use np.float64. Default is np.float32.
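The role of the three buffers can be illustrated with a minimal 1-D sketch in plain Python. The variable names mirror the documentation above, but the tile layout, the constant tile values and the list-based buffers are assumptions made for illustration; this is not the library's implementation.

```python
# 1-D sketch: two overlapping tiles of length 4 over a signal of length 6.
signal_len, tile_len, step = 6, 4, 2
window = [1.0] * tile_len            # boxcar window: every weight is 1

data = [0.0] * signal_len            # running sum of tile * window
weights_sum = [0.0] * signal_len     # running sum of window weights
data_visits = [0] * signal_len       # how many tiles touched each element

for start in range(0, signal_len - tile_len + 1, step):
    tile = [10.0] * tile_len         # pretend every processed tile is constant
    for i in range(tile_len):
        data[start + i] += tile[i] * window[i]
        weights_sum[start + i] += window[i]
        data_visits[start + i] += 1

# Dividing by the accumulated weights undoes the double counting in the overlap.
merged = [d / w for d, w in zip(data, weights_sum)]
```

Elements covered by both tiles accumulate twice in data and weights_sum, so the division restores the original scale.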
Supported windows:
- 'boxcar' (default) Boxcar window: the weight of each tile element is 1. Also known as rectangular window or Dirichlet window (and equivalent to no window at all).
- 'triang' Triangular window.
- 'blackman' Blackman window.
- 'hamming' Hamming window.
- 'hann' Hann window.
- 'bartlett' Bartlett window.
- 'parzen' Parzen window.
- 'bohman' Bohman window.
- 'blackmanharris' Minimum 4-term Blackman-Harris window.
- 'nuttall' Minimum 4-term Blackman-Harris window according to Nuttall.
- 'barthann' Bartlett-Hann window.
- 'overlap-tile' Creates a boxcar window for the non-overlapping, middle part of the tile, and zeros everywhere else. Requires applying padding calculated with Tiler.calculate_padding() for correct results (based on Ronneberger et al. 2015, U-Net paper).
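To give a feel for what such windows look like, here is a small plain-Python sketch of a symmetric Hann window and of an overlap-tile style window in 1-D. The helper names and the margin parameter are hypothetical, and the outer product is only one common way to extend a 1-D window to 2-D; the library may generate its windows differently.

```python
import math

def hann(n):
    """Symmetric Hann window of length n (n >= 2): zero at both ends, one in the middle."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]

def overlap_tile(n, margin):
    """Ones in the middle part, zeros in `margin` elements at each edge (hypothetical parameter)."""
    return [1.0 if margin <= i < n - margin else 0.0 for i in range(n)]

def outer(w_rows, w_cols):
    """Combine two 1-D windows into a 2-D window by an outer product."""
    return [[r * c for c in w_cols] for r in w_rows]

hann_5 = hann(5)
middle_only = overlap_tile(6, 2)
window_2d = outer(middle_only, middle_only)
```

A tapering window such as Hann down-weights tile borders, where predictions are often least reliable, while the overlap-tile window discards the borders entirely.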
    def set_window(self, window: Union[None, str, np.ndarray] = None) -> None:
        """Sets window function depending on the given window function.

        Args:
            window (None, str or np.ndarray): Specifies which window function to use for tile merging.
                Must be one of `Merger.SUPPORTED_WINDOWS` or a numpy array with the same size as the tile.
                If passed None sets a boxcar window (constant 1s).

        Returns:
            None
        """

        # Warn user that changing window type after some elements were already visited is a bad idea.
        if np.count_nonzero(self.data_visits):
            warnings.warn("You are setting window type after some elements were already added.", stacklevel=2)

        # Default window is boxcar
        if window is None:
            window = "boxcar"

        # Generate or set a window function
        if isinstance(window, str):
            if window not in self.SUPPORTED_WINDOWS:
                raise ValueError("Unsupported window, please check docs")
            self.window = self._generate_window(window, self.tiler.tile_shape)
        elif isinstance(window, np.ndarray):
            if not np.array_equal(window.shape, self.tiler.tile_shape):
                raise ValueError("Window function must have the same shape as tile shape.")
            self.window = window.astype(self.weights_dtype)
        else:
            raise ValueError(f"Unsupported type for window function ({type(window)}), expected str or np.ndarray.")
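Sets the window function used for tile merging. A warning is issued if some elements have already been added, and a ValueError is raised for an unsupported window name, a window array whose shape differs from the tile shape, or an unsupported argument type.
Arguments:
- window (None, str or np.ndarray): Specifies which window function to use for tile merging. Must be one of Merger.SUPPORTED_WINDOWS or a NumPy array with the same shape as the tile. If None is passed, a boxcar window (constant 1s) is set.
Returns:
None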
    def reset(self, save_visits: bool = True) -> None:
        """Reset data, weights and optional data_visits buffers.

        Should be done after finishing merging full tile set and before starting processing the next tile set.

        Args:
            save_visits (bool): Specify whether to save which elements has been modified and how many times in
                `self.data_visits`. Can be disabled to save some memory. Default is `True`.

        Returns:
            None
        """

        padded_data_shape = self.tiler._new_shape

        # Image holds sum of all processed tiles multiplied by the window
        if self.logits:
            self.data = np.zeros((self.logits, *padded_data_shape), dtype=self.data_dtype)
        else:
            self.data = np.zeros(padded_data_shape, dtype=self.data_dtype)

        # Data visits holds the number of times each element was assigned
        if save_visits:
            self.data_visits = np.zeros(padded_data_shape, dtype=np.uint32)  # uint32 ought to be enough for anyone :)

        # Total data window (weight) coefficients
        self.weights_sum = np.zeros(padded_data_shape, dtype=self.weights_dtype)
Resets the data, weights and optional data_visits buffers.
Should be called after merging a full tile set and before processing the next tile set.
Arguments:
- save_visits (bool): Specifies whether to record in self.data_visits which elements have been modified and how many times. Can be disabled to save some memory. Default is True.
Returns:
None
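The shapes of the buffers that a reset allocates can be sketched as follows. The helper name buffer_shapes is hypothetical, and padded_shape stands in for the tiler's padded data shape; the sketch only reports shapes and allocates nothing.

```python
def buffer_shapes(padded_shape, logits=0, save_visits=True):
    """Return the shapes of the buffers a reset would allocate (illustrative only)."""
    padded_shape = tuple(padded_shape)
    # With logits > 0 the data buffer gets an extra leading dimension.
    data_shape = (logits, *padded_shape) if logits else padded_shape
    return {
        "data": data_shape,
        "weights_sum": padded_shape,
        # None here means "not allocated by this call".
        "data_visits": padded_shape if save_visits else None,
    }

plain = buffer_shapes((3, 2000, 1250))
with_logits = buffer_shapes((2000, 1250), logits=5, save_visits=False)
```

Only the data buffer carries the logits dimension; the weights and visit counters are shared across it, which is why disabling save_visits saves one full-size uint32 array.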
    def add(self, tile_id: int, data: np.ndarray) -> None:
        """Adds `tile_id`-th tile into Merger.

        Args:
            tile_id (int): Specifies which tile it is.

            data (np.ndarray): Specifies tile data.

        Returns:
            None
        """
        if tile_id < 0 or tile_id >= len(self.tiler):
            raise IndexError(
                f"Out of bounds, there is no tile {tile_id}. There are {len(self.tiler)} tiles, starting from index 0."
            )

        data_shape = np.array(data.shape)
        expected_tile_shape = (
            ((self.logits,) + tuple(self.tiler.tile_shape)) if self.logits > 0 else tuple(self.tiler.tile_shape)
        )

        if self.tiler.mode != "irregular":
            if not np.all(np.equal(data_shape, expected_tile_shape)):
                raise ValueError(
                    f"Passed data shape ({data_shape}) does not fit expected tile shape ({expected_tile_shape})."
                )
        else:
            if not np.all(np.less_equal(data_shape, expected_tile_shape)):
                raise ValueError(
                    f"Passed data shape ({data_shape}) must be less or equal than tile shape ({expected_tile_shape})."
                )

        # Select coordinates for data
        shape_diff = expected_tile_shape - data_shape
        a, b = self.tiler.get_tile_bbox(tile_id, with_channel_dim=True)

        sl = [slice(x, y - shape_diff[i]) for i, (x, y) in enumerate(zip(a, b, strict=False))]
        win_sl = [slice(None, -diff) if (diff > 0) else slice(None, None) for diff in shape_diff]

        if self.logits > 0:
            self.data[tuple([slice(None, None, None)] + sl)] += data * self.window[tuple(win_sl[1:])]
            self.weights_sum[tuple(sl)] += self.window[tuple(win_sl[1:])]
        else:
            self.data[tuple(sl)] += data * self.window[tuple(win_sl)]
            self.weights_sum[tuple(sl)] += self.window[tuple(win_sl)]

        if self.data_visits is not None:
            self.data_visits[tuple(sl)] += 1
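Adds the tile_id-th tile into Merger. Raises IndexError if tile_id is outside the range of tiles produced by the Tiler, and ValueError if the data shape does not match the expected tile shape (in "irregular" mode, the data may be smaller than the tile shape).
Arguments:
- tile_id (int): Specifies which tile it is, counting from 0.
- data (np.ndarray): Specifies tile data.
Returns:
None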
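The way a smaller-than-expected tile is placed into the buffers can be sketched in plain Python for the case without a logits dimension. The function name placement_slices and the example bounding box are hypothetical; the slicing logic follows the idea in the source listing, where the tile's bounding box and the window are both trimmed by the difference between the expected and the actual tile shape.

```python
def placement_slices(bbox_start, bbox_end, tile_shape, data_shape):
    """Return (buffer slices, window slices) for a tile that may be smaller than tile_shape."""
    shape_diff = [t - d for t, d in zip(tile_shape, data_shape)]
    if any(diff < 0 for diff in shape_diff):
        raise ValueError("data must not be larger than the tile shape")
    # Shrink the end of the bounding box by the missing elements on each axis.
    sl = tuple(slice(x, y - diff) for x, y, diff in zip(bbox_start, bbox_end, shape_diff))
    # Trim the window the same way so that both operands have equal shape.
    win_sl = tuple(slice(None, -diff) if diff > 0 else slice(None) for diff in shape_diff)
    return sl, win_sl

# A 250x250 tile slot at rows 0..250, columns 750..1000, filled with a 250x200 tile.
sl, win_sl = placement_slices((0, 750), (250, 1000), (250, 250), (250, 200))
```

Here the tile is 50 columns short, so it is written to columns 750..950 and only the first 200 columns of the window are used.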
    def add_batch(self, batch_id: int, batch_size: int, data: np.ndarray) -> None:
        """Adds `batch_id`-th batch of `batch_size` tiles into Merger.

        Args:
            batch_id (int): Specifies batch number, must be >= 0.

            batch_size (int): Specifies batch size, must be >= 0.

            data (np.ndarray): Tile data array, must have shape `[batch, *tile_shape]

        Returns:
            None
        """

        # calculate total number of batches
        div, mod = np.divmod(len(self.tiler), batch_size)
        n_batches = (div + 1) if mod > 0 else div

        if batch_id < 0 or batch_id >= n_batches:
            raise IndexError(f"Out of bounds. There are {n_batches} batches of {batch_size}, starting from index 0.")

        # add each tile in a batch with computed tile_id
        for data_i, tile_i in enumerate(
            range(batch_id * batch_size, min((batch_id + 1) * batch_size, len(self.tiler)))
        ):
            self.add(tile_i, data[data_i])
Adds the batch_id-th batch of batch_size tiles into Merger. Raises IndexError if batch_id is outside the range of available batches.
Arguments:
- batch_id (int): Specifies batch number, must be >= 0.
- batch_size (int): Specifies batch size, must be > 0.
- data (np.ndarray): Tile data array, must have shape [batch, *tile_shape].
Returns:
None
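The mapping from a batch to the tile ids it contains can be sketched in plain Python. The helper name tile_ids_in_batch is hypothetical; the arithmetic follows the source listing: the number of batches is the tile count divided by the batch size, rounded up, and the last batch may be shorter.

```python
def tile_ids_in_batch(n_tiles, batch_size, batch_id):
    """Return the tile ids covered by the batch_id-th batch of batch_size tiles."""
    n_batches = -(-n_tiles // batch_size)   # ceiling division
    if batch_id < 0 or batch_id >= n_batches:
        raise IndexError(f"There are {n_batches} batches of {batch_size}, starting from index 0.")
    first = batch_id * batch_size
    last = min((batch_id + 1) * batch_size, n_tiles)  # the final batch may be partial
    return list(range(first, last))

full_batch = tile_ids_in_batch(25, 10, 0)
last_batch = tile_ids_in_batch(25, 10, 2)
```

With 25 tiles and a batch size of 10 there are three batches, and the last one holds only five tiles, so the data array passed for it needs only five entries.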
    def merge(
        self,
        unpad: bool = True,
        extra_padding: Optional[List[Tuple[int, int]]] = None,
        argmax: bool = False,
        normalize_by_weights: bool = True,
        dtype: Optional[npt.DTypeLike] = None,
    ) -> np.ndarray:
        """Returns merged data array obtained from added tiles.

        Args:
            unpad (bool): If unpad is True, removes padded array elements. Default is True.

            extra_padding (tuple of tuples of two ints, optional): Specifies padding that was applied to the data.
                Number of values padded to the edges of each axis.
                ((before_1, after_1), … (before_N, after_N)) unique pad widths for each axis.
                Default is None.

            argmax (bool): If argmax is True, the first dimension will be argmaxed.
                Useful when merger is initialized with `logits=True`.
                Default is False.

            normalize_by_weights (bool): If normalize is True, the accumulated data will be divided by weights.
                Default is True.

            dtype (np.dtype, optional): Specify dtype for the final merged output.
                If None, uses `data_dtype` specified when Merger was initialized.
                Default is None.

        Returns:
            np.ndarray: Final merged data array obtained from added tiles.
        """

        data = self.data

        if normalize_by_weights:
            # ignoring division by zero
            # alternatively, set values < atol to 1
            # https://github.com/the-lay/tiler/blob/46e948bb2bd7a909e954baf87a0c15b384109fde/tiler/merger.py#L314
            # TODO check which way is better
            # ignoring should be more precise without atol
            # but can hide other errors
            with np.errstate(divide="ignore", invalid="ignore"):
                data = np.nan_to_num(data / self.weights_sum)

        if unpad:
            data = self._unpad(data, extra_padding)

        if argmax:
            data = np.argmax(data, 0)

        if dtype is not None:
            return data.astype(dtype)
        else:
            return data.astype(self.data_dtype)
Returns the merged data array obtained from added tiles.
Arguments:
- unpad (bool): If unpad is True, removes padded array elements. Default is True.
- extra_padding (tuple of tuples of two ints, optional): Specifies padding that was applied to the data, as the number of values padded to the edges of each axis: ((before_1, after_1), … (before_N, after_N)), with unique pad widths for each axis. Default is None.
- argmax (bool): If argmax is True, the first dimension will be argmaxed. Useful when the merger is initialized with logits > 0. Default is False.
- normalize_by_weights (bool): If normalize_by_weights is True, the accumulated data will be divided by the accumulated weights. Default is True.
- dtype (np.dtype, optional): Specifies the dtype of the final merged output. If None, uses the data_dtype specified when Merger was initialized. Default is None.
Returns:
np.ndarray: Final merged data array obtained from added tiles.
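The final merging steps (normalizing by weights, unpadding, and the optional argmax over the first dimension) can be sketched in plain Python on 1-D lists. The helper names merge_1d and argmax_first_axis and the valid_len parameter are hypothetical stand-ins; mapping zero-weight elements to 0 mirrors what the source listing obtains for 0/0 via np.nan_to_num.

```python
def merge_1d(data, weights_sum, valid_len=None):
    """Divide accumulated data by accumulated weights, then cut off padding."""
    # Elements that were never visited have weight 0; they become 0 in the output.
    merged = [d / w if w != 0 else 0.0 for d, w in zip(data, weights_sum)]
    return merged[:valid_len]           # valid_len=None keeps everything

def argmax_first_axis(logits):
    """For each position, return the index of the row (class) with the largest value."""
    n_classes, n_positions = len(logits), len(logits[0])
    return [max(range(n_classes), key=lambda c: logits[c][i]) for i in range(n_positions)]

merged = merge_1d([10.0, 20.0, 0.0, 99.0], [1.0, 2.0, 0.0, 1.0], valid_len=3)
labels = argmax_first_axis([[0.1, 0.9], [0.8, 0.2]])
```

In the sketch the fourth element is treated as padding and dropped, and the argmax turns two rows of per-class scores into one row of class indices.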