from layers02 import *
from connectome import Chain
source = HeLa(root='DIC-C2DH-HeLa')
key = source.ids[0]
dataset = Chain(
    source,
    Binarize(),
    Zoom(factor=0.25),
    Crop(),
)
x, y = dataset.image(key), dataset.mask(key)
import matplotlib.pyplot as plt
plt.subplot(1, 2, 1)
plt.imshow(x, cmap='gray')
plt.axis('off')
plt.subplot(1, 2, 2)
plt.imshow(y, cmap='gray')
plt.axis('off');
Data Augmentation
Data augmentation is a useful technique that effectively increases the dataset size by applying random transformations to the data.
So far we only used pure functions in our transforms: their output depends only on the input arguments. connectome heavily relies on this property, because impure functions (e.g. ones that generate random values) can cause a lot of trouble in your pipelines.
However, impure functions are sometimes very useful, and we can use them in our transforms with the impure decorator. Let's write a layer that randomly rotates the image and the mask:
from connectome import impure
import numpy as np
from skimage.transform import rotate
class Rotate(Transform):
    @impure
    def _degrees():
        return np.random.uniform(0, 360)

    def image(image, _degrees):
        return rotate(image, _degrees, mode='reflect', order=1)

    def mask(mask, _degrees):
        return rotate(mask.astype(float), _degrees, mode='reflect', order=1) >= 0.5
We already saw such transforms, except for the impure decorator. It is important to annotate all impure functions with it, because this allows connectome to perform additional checks that help you avoid shooting yourself in the foot.
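One kind of trouble an unmarked impure function can cause is stale caching. The sketch below uses only the standard library, not connectome, and the function names are hypothetical; it is an illustration of the general problem, not a description of what connectome does internally.

```python
import functools
import random


def random_degrees():
    # Impure: the result changes between calls although the arguments do not.
    return random.uniform(0, 360)


@functools.lru_cache(maxsize=None)
def cached_degrees():
    # Memoizing an impure function silently freezes its first result.
    return random.uniform(0, 360)


random.seed(0)
fresh = [random_degrees() for _ in range(5)]
frozen = [cached_degrees() for _ in range(5)]

print("distinct values without caching:", len(set(fresh)))
print("distinct values with caching:", len(set(frozen)))
```

A cached "random" angle would rotate every image the same way, which defeats the purpose of augmentation. Marking the function as impure gives a library the information it needs to detect situations like this.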
augmented = dataset >> Rotate()
x, y = augmented.image(key), augmented.mask(key)
plt.subplot(1, 2, 1)
plt.imshow(x, cmap='gray')
plt.axis('off')
plt.subplot(1, 2, 2)
plt.imshow(y, cmap='gray')
plt.axis('off');
Wait, something is wrong. The image and the mask are not consistent anymore! The problem is in this line:
x, y = augmented.image(key), augmented.mask(key)
We are making separate calls to image and mask, so each call draws its own random angle. There is no way for connectome to figure out that they must be consistent. We need to compute them both in a single call. Here is how you do it:
loader = augmented._compile(['image', 'mask'])
We created a new function that returns an (image, mask) tuple:
x, y = loader(key)
plt.subplot(1, 2, 1)
plt.imshow(x, cmap='gray')
plt.axis('off')
plt.subplot(1, 2, 2)
plt.imshow(y, cmap='gray')
plt.axis('off');
That's more like it. The image and the mask are consistent again, but different on each call:
x, y = loader(key)
plt.subplot(1, 2, 1)
plt.imshow(x, cmap='gray')
plt.axis('off')
plt.subplot(1, 2, 2)
plt.imshow(y, cmap='gray')
plt.axis('off');
x, y = loader(key)
plt.subplot(1, 2, 1)
plt.imshow(x, cmap='gray')
plt.axis('off')
plt.subplot(1, 2, 2)
plt.imshow(y, cmap='gray')
plt.axis('off');
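The difference between separate calls and a single call can be sketched without connectome. The example below is a simplified stand-in that assumes NumPy only: it uses 90-degree rotations instead of skimage's rotate, and the helper names (load_separately, load_together) are hypothetical.

```python
import numpy as np


def load_separately(image, mask, rng):
    # Each output draws its own random parameter, like two independent calls.
    image_out = np.rot90(image, rng.integers(0, 4))
    mask_out = np.rot90(mask, rng.integers(0, 4))
    return image_out, mask_out


def load_together(image, mask, rng):
    # The parameter is drawn once and shared by both outputs.
    k = rng.integers(0, 4)
    return np.rot90(image, k), np.rot90(mask, k)


def is_consistent(image_out, mask_out):
    # The mask was built as `image >= 8`, so this must still hold after rotation.
    return np.array_equal(image_out >= 8, mask_out)


rng = np.random.default_rng(0)
image = np.arange(16).reshape(4, 4)
mask = image >= 8

together_ok = all(
    is_consistent(*load_together(image, mask, rng)) for _ in range(50)
)
separate_ok = all(
    is_consistent(*load_separately(image, mask, rng)) for _ in range(50)
)

print("single call always consistent:", together_ok)
print("separate calls always consistent:", separate_ok)
```

Drawing the random parameter once per call is what keeps the pair aligned, and that is the role the compiled loader plays in the cells above.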
Image noising
Until now we only used impure for internal parameters, but we can go further and make outputs impure:
class NormalNoise(Transform):
    __inherit__ = 'mask'

    @impure
    def image(image):
        return image + np.random.normal(scale=10, size=image.shape)
augmented = dataset >> Rotate() >> NormalNoise()
loader = augmented._compile(['image', 'mask'])
x, y = loader(key)
plt.subplot(1, 2, 1)
plt.imshow(x, cmap='gray')
plt.axis('off')
plt.subplot(1, 2, 2)
plt.imshow(y, cmap='gray')
plt.axis('off');
Maybe that's too much noise. Let's move the scale parameter out:
class NormalNoise(Transform):
    __inherit__ = 'mask'
    _scale: float = 1

    @impure
    def image(image, _scale):
        return image + np.random.normal(scale=_scale, size=image.shape)
augmented = dataset >> Rotate() >> NormalNoise(scale=3)
loader = augmented._compile(['image', 'mask'])
x, y = loader(key)
plt.subplot(1, 2, 1)
plt.imshow(x, cmap='gray')
plt.axis('off')
plt.subplot(1, 2, 2)
plt.imshow(y, cmap='gray')
plt.axis('off');
Note that the _scale parameter has a default value of 1.
Note how both the image and the mask got cropped. This is data consistency at work!
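To get a feel for the scale parameter of NormalNoise, here is a small NumPy-only sketch. It assumes a constant synthetic image instead of the HeLa data, and add_normal_noise is a hypothetical helper that mirrors the body of NormalNoise.image.

```python
import numpy as np


def add_normal_noise(image, scale, rng):
    # Same arithmetic as NormalNoise.image, with an explicit random generator.
    return image + rng.normal(scale=scale, size=image.shape)


rng = np.random.default_rng(0)
image = np.full((256, 256), 128.0)  # flat gray image, assumed for illustration

measured = {}
for scale in (1, 3, 10):
    noisy = add_normal_noise(image, scale, rng)
    # The standard deviation of the added noise should be close to `scale`.
    measured[scale] = float((noisy - image).std())
    print(f"scale={scale}: measured noise std = {measured[scale]:.2f}")
```

The measured standard deviation tracks the scale argument, so lowering scale from 10 to 3 reduces the noise amplitude by roughly a factor of three.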
That's all for now.
The main points to remember while working with data augmentation:
- Annotate your impure functions with impure
- Use ._compile to make a function that loads your data in a single call
See you next time!