image_processing package

Submodules

image_processing.combination module

Combination functions for aggregating per-orientation convolution responses.

A combination function receives the full (C, N, H, W) response tensor, produced by applying N oriented kernels to a C-channel image, and reduces it to an (H, W) edge-strength map.

All built-in functions expect the response to have at least two dimensions (C, N, ...); additional spatial dimensions are preserved.
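As an illustration of this contract, a user-defined combination function might look like the minimal sketch below. The name `rms_combine` is hypothetical and not part of the package; the sketch assumes only the (C, N, H, W) layout described above.

```python
import torch


def rms_combine(response: torch.Tensor) -> torch.Tensor:
    """Hypothetical combiner: root-mean-square over channels and orientations."""
    # Reduce over dim 0 (channels) and dim 1 (orientations); the trailing
    # spatial dimensions are left untouched.
    return response.pow(2).mean(dim=(0, 1)).sqrt()


response = torch.randn(3, 10, 64, 64)   # stand-in for a (C, N, H, W) response
edge_map = rms_combine(response)        # (H, W) map of non-negative values
```

Any callable with this input and output shape should be usable wherever a CombineFn is expected.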

type image_processing.combination.CombineFn = Callable[[Tensor], Tensor]

image_processing.combination.max_abs(response)

Return the pixel-wise maximum absolute response over channels and orientations.

Parameters:

response (Tensor) – Shape (C, N, H, W).

Returns:

Shape (H, W).

Return type:

Tensor

Examples

>>> import torch
>>> from image_processing.combination import max_abs
>>> r = torch.randn(3, 10, 64, 64)
>>> max_abs(r).shape
torch.Size([64, 64])

image_processing.combination.sum_of_abs(response)

Sum absolute responses over all colour channels and kernel orientations.

Parameters:

response (Tensor) – Shape (C, N, H, W).

Returns:

Shape (H, W).

Return type:

Tensor

Examples

>>> import torch
>>> from image_processing.combination import sum_of_abs
>>> r = torch.randn(3, 10, 64, 64)
>>> sum_of_abs(r).shape
torch.Size([64, 64])

image_processing.combination.sum_of_powers(power)

Return a combination function that sums |response|^power.

Higher powers emphasise strong responses and suppress weak ones, which sharpens detected edges at the cost of sensitivity. power=1 is equivalent to sum_of_abs(); power=2 is equivalent to sum_of_squares().
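The emphasis effect can be checked with plain arithmetic. The sketch below is not the package's implementation: `power_sum` is a hypothetical helper that works on a flat list of responses at a single pixel, and the response values are made up for illustration.

```python
def power_sum(values, power):
    """Sum |v| ** power over the responses collected at one pixel."""
    return sum(abs(v) ** power for v in values)


strong_pixel = [2.0, -1.5, 0.5]   # hypothetical responses on an edge
weak_pixel = [0.2, -0.1, 0.1]     # hypothetical responses in a flat region

# The strong/weak ratio grows with the exponent, so weak responses are
# pushed towards zero relative to strong ones.
ratios = [
    power_sum(strong_pixel, p) / power_sum(weak_pixel, p)
    for p in (1.0, 2.0, 3.0)
]

# power=1 reduces to a sum of absolute values, power=2 to a sum of squares.
abs_sum = sum(abs(v) for v in strong_pixel)
square_sum = sum(v * v for v in strong_pixel)
```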

Parameters:

power (float) – Exponent applied to the absolute response before summation.

Returns:

A callable with signature (C, N, H, W) → (H, W).

Return type:

Callable[[Tensor], Tensor]

Examples

>>> import torch
>>> from image_processing.combination import sum_of_powers
>>> fn = sum_of_powers(3.0)
>>> fn(torch.randn(3, 10, 64, 64)).shape
torch.Size([64, 64])

image_processing.combination.sum_of_squares(response)

Sum squared responses over all colour channels and kernel orientations.

Parameters:

response (Tensor) – Shape (C, N, H, W).

Returns:

Shape (H, W).

Return type:

Tensor

Examples

>>> import torch
>>> from image_processing.combination import sum_of_squares
>>> r = torch.randn(3, 10, 64, 64)
>>> sum_of_squares(r).shape
torch.Size([64, 64])

image_processing.detector module

Edge detector that applies a kernel stack to an image and combines responses.

class image_processing.detector.EdgeDetector(kernel=None, combine_fn=None, normalize=True)

Bases: object

Apply a stack of convolution kernels to detect edges in an image.

The detector runs the following steps:

  1. Moves the input image to the kernel’s device and casts it to float32 if needed.

  2. Applies all N kernel orientations to each of the C colour channels via torch.nn.functional.conv2d() with padding='same', producing a (C, N, H, W) response tensor.

  3. Passes the response tensor to combine_fn to obtain a (H, W) edge-strength map.

  4. Optionally normalises the map to [0, 1].

The output spatial size always matches the input spatial size.
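The four steps can be sketched in a few lines of PyTorch. This is an illustrative approximation under stated assumptions, not the package's code: `detect_sketch` is a hypothetical name, the kernel stack is random, the combination is hard-coded to a sum of squares, and the image is always cast to float32.

```python
import torch
import torch.nn.functional as F


def detect_sketch(image: torch.Tensor, kernels: torch.Tensor) -> torch.Tensor:
    """Convolve, combine with a sum of squares, then normalise to [0, 1]."""
    # Step 1: move to the kernels' device and cast (always float32 here).
    image = image.to(device=kernels.device, dtype=torch.float32)
    # Step 2: treat the C channels as a batch of single-channel images so
    # every kernel is applied to every channel: result is (C, N, H, W).
    response = F.conv2d(image.unsqueeze(1), kernels.unsqueeze(1), padding="same")
    # Step 3: reduce over channels and orientations.
    edge_map = response.pow(2).sum(dim=(0, 1))
    # Step 4: divide by the maximum, leaving an all-zero map unchanged.
    peak = edge_map.max()
    return edge_map / peak if peak > 0 else edge_map


kernels = torch.randn(4, 5, 5)                     # stand-in (N, kH, kW) stack
edges = detect_sketch(torch.rand(3, 32, 48), kernels)
flat = detect_sketch(torch.zeros(3, 8, 8), kernels)
```

Adding a singleton channel dimension to both the image and the kernel stack is one way to apply every kernel to every channel in a single `conv2d` call.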

Parameters:
  • kernel (BaseKernel | None) – Provider of the oriented kernel stack. Defaults to ElongatedMaskKernel with default parameters.

  • combine_fn (Callable[[Tensor], Tensor] | None) – Callable (C, N, H, W) → (H, W) that aggregates the response tensor. Defaults to sum_of_squares().

  • normalize (bool) – When True (default), divides the edge map by its maximum so the output lies in [0, 1]. Has no effect when the map is all zeros.

Examples

Minimal usage with default parameters:

>>> import torch
>>> from image_processing import EdgeDetector
>>> detector = EdgeDetector()
>>> image = torch.rand(3, 256, 256)        # (C, H, W) float image
>>> edges = detector.detect(image)
>>> edges.shape
torch.Size([256, 256])

Custom kernel and combination function:

>>> from image_processing import (
...     EdgeDetector,
...     ElongatedMaskKernel,
...     ElongatedMaskParams,
... )
>>> from image_processing.combination import sum_of_powers
>>> params = ElongatedMaskParams(
...     n_angles=18,
...     kernel_half_size=30,
...     length_falloff=0.1,
...     width_falloff=1.0,
... )
>>> detector = EdgeDetector(
...     kernel=ElongatedMaskKernel(params, device="cpu"),
...     combine_fn=sum_of_powers(3.0),
... )
>>> detector.detect(torch.rand(3, 128, 128)).shape
torch.Size([128, 128])

detect(image)

Detect edges in a single image.

Parameters:

image (Tensor) – Input image of shape (C, H, W) or (H, W). Grayscale inputs (H, W) are expanded to (1, H, W) automatically. Non-float tensors are cast to float32.

Returns:

Edge map of shape (H, W) on the kernel’s device.

Return type:

Tensor

Raises:

ValueError – If image is not 2-D or 3-D.
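The input rules above could be implemented along the following lines. `prepare_image` is a hypothetical helper written for illustration; the package's internal handling may differ in detail.

```python
import torch


def prepare_image(image: torch.Tensor) -> torch.Tensor:
    """Hypothetical helper mirroring the documented input rules."""
    if image.ndim == 2:
        image = image.unsqueeze(0)          # (H, W) -> (1, H, W)
    elif image.ndim != 3:
        raise ValueError(f"expected a 2-D or 3-D image, got {image.ndim}-D")
    if not image.is_floating_point():
        image = image.to(torch.float32)     # e.g. uint8 -> float32
    return image


gray = prepare_image(torch.zeros(16, 16, dtype=torch.uint8))
```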

Examples

>>> import torch
>>> from image_processing import EdgeDetector
>>> EdgeDetector().detect(torch.rand(3, 64, 64)).shape
torch.Size([64, 64])

detect_batch(images)

Detect edges in a list of images.

Images may have different spatial sizes; each is processed independently via detect().

Parameters:

images (list[Tensor]) – Each element is a (C, H, W) or (H, W) tensor.

Returns:

One (H, W) edge map per input image.

Return type:

list[Tensor]

Examples

>>> import torch
>>> from image_processing import EdgeDetector
>>> imgs = [torch.rand(3, 64, 64), torch.rand(3, 128, 96)]
>>> [m.shape for m in EdgeDetector().detect_batch(imgs)]
[torch.Size([64, 64]), torch.Size([128, 96])]

image_processing.kernels module

Convolution kernel classes for edge detection.

class image_processing.kernels.BaseKernel(params, device=None)

Bases: ABC

Abstract base class for edge-detection convolution kernels.

Concrete subclasses implement build() to construct a (N, kH, kW) kernel stack on self.device. The result is cached after the first access and can be discarded with reset().

Parameters:
  • params (BaseKernelParams) – Kernel configuration.

  • device (device | str | None) – Computation device. Defaults to CUDA when available, otherwise CPU.

Examples

Minimal subclass skeleton:

class MyKernel(BaseKernel):
    def build(self) -> torch.Tensor:
        # return tensor of shape (n_kernels, kH, kW) on self.device
        ...

abstractmethod build()

Build and return the kernel stack.

Returns:

Shape (n_kernels, kH, kW) on self.device.

Return type:

Tensor

property kernels: Tensor

Return the lazily built and cached kernel tensor.

Returns:

Shape (n_kernels, kH, kW).

Return type:

torch.Tensor

reset()

Invalidate the cached kernels.

The next access to kernels triggers a fresh call to build().

Return type:

None
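The build-once, cache, and reset behaviour described for BaseKernel follows a common lazy-initialisation pattern. The sketch below shows that pattern in isolation; `CachedStack` and `CountingStack` are hypothetical classes and the `_cache` attribute name is an assumption, not taken from the package.

```python
from abc import ABC, abstractmethod


class CachedStack(ABC):
    """Hypothetical base class: build on first access, cache, allow reset."""

    def __init__(self) -> None:
        self._cache = None

    @abstractmethod
    def build(self):
        """Construct the (possibly expensive) kernel stack."""

    @property
    def kernels(self):
        if self._cache is None:          # first access, or first after reset()
            self._cache = self.build()
        return self._cache

    def reset(self) -> None:
        self._cache = None               # drop the cached result


class CountingStack(CachedStack):
    """Toy subclass that counts how often build() runs."""

    def __init__(self) -> None:
        super().__init__()
        self.calls = 0

    def build(self):
        self.calls += 1
        return [[0.0]]                   # placeholder for a kernel tensor


stack = CountingStack()
stack.kernels
stack.kernels                            # served from the cache
calls_before_reset = stack.calls
stack.reset()
stack.kernels                            # rebuilt after reset()
calls_after_reset = stack.calls
```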

class image_processing.kernels.ElongatedMaskKernel(params=None, device=None)

Bases: BaseKernel

Rotated elongated-stripe convolution kernels for edge detection.

Implements the method from Contour-texture separation: part 2 (Antal, 2024). The construction proceeds as follows:

  1. A horizontal stripe of decaying values is placed on a blank square canvas. Weights fall off along the stripe length as 1 / (length_falloff * |j| + 1) and optionally across the width as 1 / (width_falloff * i + 1).

  2. The canvas is converted to a PIL image and rotated to n_angles orientations evenly distributed in [0°, 180°).

  3. Each rotated mask is anti-symmetrised by subtracting its 180° rotation (a flip along both axes). The resulting kernels detect signed intensity gradients perpendicular to the stripe.

With padding='same' the convolution output preserves the spatial size of the input image.
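The anti-symmetrisation step can be illustrated on a small canvas with plain Python lists. This is a simplified sketch: the stripe is uniform, its placement just above the centre row is an assumption, and the PIL rotation step is omitted.

```python
half = 3                                   # hypothetical kernel_half_size
size = 2 * half + 1                        # 7 x 7 canvas

# Step 1 (simplified): a uniform two-row stripe just above the centre row.
# The exact placement used by the package is an assumption here.
mask = [[0.0] * size for _ in range(size)]
for row in (half - 2, half - 1):
    for col in range(size):
        mask[row][col] = 1.0

# Step 3: a 180-degree rotation reverses both the row order and each row.
rotated = [list(reversed(row)) for row in reversed(mask)]
kernel = [
    [m - r for m, r in zip(mask_row, rot_row)]
    for mask_row, rot_row in zip(mask, rotated)
]

# Print the kernel with explicit signs to make the odd symmetry visible.
for row in kernel:
    print(" ".join(f"{v:+.0f}" for v in row))
```

The resulting kernel is odd under a 180° rotation and sums to zero, so a constant image region produces no response while an intensity step across the stripe produces a signed one.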

Parameters:
  • params (ElongatedMaskParams | None) – Kernel configuration. Uses default ElongatedMaskParams when None.

  • device (device | str | None) – Computation device. Defaults to CUDA when available, otherwise CPU.

Examples

Default configuration (10 angles, 41 x 41 kernel):

>>> from image_processing import ElongatedMaskKernel
>>> kernel = ElongatedMaskKernel()
>>> kernel.kernels.shape
torch.Size([10, 41, 41])

GPU-tuned configuration matching the notebook’s GPU example:

>>> from image_processing import ElongatedMaskKernel, ElongatedMaskParams
>>> params = ElongatedMaskParams(
...     n_angles=18,
...     kernel_half_size=30,
...     stripe_half_width=5,
...     stripe_half_length=30,
...     length_falloff=0.1,
...     width_falloff=1.0,
... )
>>> kernel = ElongatedMaskKernel(params, device="cpu")
>>> kernel.kernels.shape
torch.Size([18, 61, 61])

build()

Build the anti-symmetrised rotated stripe kernel stack.

Returns:

Shape (n_angles, 2 * kernel_half_size + 1, 2 * kernel_half_size + 1) on self.device.

Return type:

Tensor

Raises:

ValueError – If the stripe dimensions exceed kernel_half_size.

image_processing.params module

Parameter dataclasses for edge detection kernels.

class image_processing.params.BaseKernelParams(n_angles=10, kernel_half_size=20)

Bases: object

Base parameters shared by all convolution kernel families.

Parameters:
  • n_angles (int) – Number of kernel orientations, evenly spaced in [0°, 180°).

  • kernel_half_size (int) – Half-size of the square kernel canvas. The full kernel is (2 * kernel_half_size + 1) x (2 * kernel_half_size + 1) pixels.

kernel_half_size: int = 20
n_angles: int = 10

class image_processing.params.ElongatedMaskParams(n_angles=10, kernel_half_size=20, stripe_half_width=5, stripe_half_length=20, length_falloff=0.05, width_falloff=0.0)

Bases: BaseKernelParams

Parameters for the elongated directional stripe kernel.

The base mask is a horizontal stripe whose pixel values decay along its length and optionally across its width. The stripe is rotated to n_angles orientations and anti-symmetrised by subtracting its own 180° rotation, making each oriented kernel respond to signed intensity gradients perpendicular to the stripe direction.

Parameters:
  • stripe_half_width (int) – Number of pixel rows occupied by the stripe measured from the centre row (total stripe width = stripe_half_width rows).

  • stripe_half_length (int) – Half-length of the stripe along its axis. Column offsets span -stripe_half_length to stripe_half_length - 1.

  • length_falloff (float) – Decay coefficient along the stripe axis. The weight at column offset j is 1 / (length_falloff * |j| + 1). Set to 0.0 for a uniform-weight stripe.

  • width_falloff (float) – Decay coefficient across the stripe. The weight at row offset i is multiplied by 1 / (width_falloff * i + 1). Set to 0.0 (default) for uniform weight across all rows.
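The two falloff formulas combine multiplicatively. The sketch below evaluates them for a tiny stripe; `stripe_weights` is a hypothetical helper, and the assumption that row offsets i start at 0 is not stated in the source.

```python
def stripe_weights(stripe_half_width, stripe_half_length,
                   length_falloff, width_falloff):
    """Return weights[i][k], where column offset j = k - stripe_half_length."""
    rows = []
    for i in range(stripe_half_width):               # assumed: i = 0, 1, ...
        row_scale = 1.0 / (width_falloff * i + 1.0)
        rows.append([
            row_scale / (length_falloff * abs(j) + 1.0)
            for j in range(-stripe_half_length, stripe_half_length)
        ])
    return rows


weights = stripe_weights(stripe_half_width=2, stripe_half_length=3,
                         length_falloff=0.5, width_falloff=1.0)
uniform = stripe_weights(stripe_half_width=2, stripe_half_length=3,
                         length_falloff=0.0, width_falloff=0.0)
```

With both coefficients set to 0.0 every weight equals 1.0, which matches the uniform-stripe case described above.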

length_falloff: float = 0.05
stripe_half_length: int = 20
stripe_half_width: int = 5
width_falloff: float = 0.0

Module contents

Image processing package for GPU-accelerated edge detection.

Implements the elongated-mask edge detection method from Contour-texture separation: part 2 (Antal, 2024) as an extendable, GPU-aware PyTorch package.

Quick start

>>> import torch
>>> from image_processing import EdgeDetector
>>> detector = EdgeDetector()                    # uses GPU if available
>>> image = torch.rand(3, 512, 512)              # (C, H, W) float image in [0, 1]
>>> edges = detector.detect(image)               # (H, W) edge map in [0, 1]

Customising the kernel

>>> from image_processing import (
...     EdgeDetector,
...     ElongatedMaskKernel,
...     ElongatedMaskParams,
... )
>>> from image_processing.combination import sum_of_powers
>>> params = ElongatedMaskParams(
...     n_angles=18,
...     kernel_half_size=30,
...     stripe_half_width=5,
...     stripe_half_length=30,
...     length_falloff=0.1,
...     width_falloff=1.0,
... )
>>> detector = EdgeDetector(
...     kernel=ElongatedMaskKernel(params),
...     combine_fn=sum_of_powers(3.0),
... )

Extending with a custom kernel

Subclass BaseKernel and pair it with a custom BaseKernelParams dataclass:

from dataclasses import dataclass
import torch
from image_processing import BaseKernel, BaseKernelParams

@dataclass
class MyParams(BaseKernelParams):
    sigma: float = 1.0

class MyKernel(BaseKernel):
    def build(self) -> torch.Tensor:
        p: MyParams = self.params           # type: ignore[assignment]
        # ... build and return (N, kH, kW) tensor on self.device

Public API

class image_processing.BaseKernel(params, device=None)

Bases: ABC

Abstract base class for edge-detection convolution kernels.

Concrete subclasses implement build() to construct a (N, kH, kW) kernel stack on self.device. The result is cached after the first access and can be discarded with reset().

Parameters:
  • params (BaseKernelParams) – Kernel configuration.

  • device (device | str | None) – Computation device. Defaults to CUDA when available, otherwise CPU.

Examples

Minimal subclass skeleton:

class MyKernel(BaseKernel):
    def build(self) -> torch.Tensor:
        # return tensor of shape (n_kernels, kH, kW) on self.device
        ...

abstractmethod build()

Build and return the kernel stack.

Returns:

Shape (n_kernels, kH, kW) on self.device.

Return type:

Tensor

property kernels: Tensor

Return the lazily built and cached kernel tensor.

Returns:

Shape (n_kernels, kH, kW).

Return type:

torch.Tensor

reset()

Invalidate the cached kernels.

The next access to kernels triggers a fresh call to build().

Return type:

None

class image_processing.BaseKernelParams(n_angles=10, kernel_half_size=20)

Bases: object

Base parameters shared by all convolution kernel families.

Parameters:
  • n_angles (int) – Number of kernel orientations, evenly spaced in [0°, 180°).

  • kernel_half_size (int) – Half-size of the square kernel canvas. The full kernel is (2 * kernel_half_size + 1) x (2 * kernel_half_size + 1) pixels.

kernel_half_size: int = 20
n_angles: int = 10

type image_processing.CombineFn = Callable[[Tensor], Tensor]

class image_processing.EdgeDetector(kernel=None, combine_fn=None, normalize=True)

Bases: object

Apply a stack of convolution kernels to detect edges in an image.

The detector runs the following steps:

  1. Moves the input image to the kernel’s device and casts it to float32 if needed.

  2. Applies all N kernel orientations to each of the C colour channels via torch.nn.functional.conv2d() with padding='same', producing a (C, N, H, W) response tensor.

  3. Passes the response tensor to combine_fn to obtain a (H, W) edge-strength map.

  4. Optionally normalises the map to [0, 1].

The output spatial size always matches the input spatial size.

Parameters:
  • kernel (BaseKernel | None) – Provider of the oriented kernel stack. Defaults to ElongatedMaskKernel with default parameters.

  • combine_fn (CombineFn | None) – Callable (C, N, H, W) → (H, W) that aggregates the response tensor. Defaults to sum_of_squares().

  • normalize (bool) – When True (default), divides the edge map by its maximum so the output lies in [0, 1]. Has no effect when the map is all zeros.

Examples

Minimal usage with default parameters:

>>> import torch
>>> from image_processing import EdgeDetector
>>> detector = EdgeDetector()
>>> image = torch.rand(3, 256, 256)        # (C, H, W) float image
>>> edges = detector.detect(image)
>>> edges.shape
torch.Size([256, 256])

Custom kernel and combination function:

>>> from image_processing import (
...     EdgeDetector,
...     ElongatedMaskKernel,
...     ElongatedMaskParams,
... )
>>> from image_processing.combination import sum_of_powers
>>> params = ElongatedMaskParams(
...     n_angles=18,
...     kernel_half_size=30,
...     length_falloff=0.1,
...     width_falloff=1.0,
... )
>>> detector = EdgeDetector(
...     kernel=ElongatedMaskKernel(params, device="cpu"),
...     combine_fn=sum_of_powers(3.0),
... )
>>> detector.detect(torch.rand(3, 128, 128)).shape
torch.Size([128, 128])

detect(image)

Detect edges in a single image.

Parameters:

image (Tensor) – Input image of shape (C, H, W) or (H, W). Grayscale inputs (H, W) are expanded to (1, H, W) automatically. Non-float tensors are cast to float32.

Returns:

Edge map of shape (H, W) on the kernel’s device.

Return type:

Tensor

Raises:

ValueError – If image is not 2-D or 3-D.

Examples

>>> import torch
>>> from image_processing import EdgeDetector
>>> EdgeDetector().detect(torch.rand(3, 64, 64)).shape
torch.Size([64, 64])

detect_batch(images)

Detect edges in a list of images.

Images may have different spatial sizes; each is processed independently via detect().

Parameters:

images (list[Tensor]) – Each element is a (C, H, W) or (H, W) tensor.

Returns:

One (H, W) edge map per input image.

Return type:

list[Tensor]

Examples

>>> import torch
>>> from image_processing import EdgeDetector
>>> imgs = [torch.rand(3, 64, 64), torch.rand(3, 128, 96)]
>>> [m.shape for m in EdgeDetector().detect_batch(imgs)]
[torch.Size([64, 64]), torch.Size([128, 96])]

class image_processing.ElongatedMaskKernel(params=None, device=None)

Bases: BaseKernel

Rotated elongated-stripe convolution kernels for edge detection.

Implements the method from Contour-texture separation: part 2 (Antal, 2024). The construction proceeds as follows:

  1. A horizontal stripe of decaying values is placed on a blank square canvas. Weights fall off along the stripe length as 1 / (length_falloff * |j| + 1) and optionally across the width as 1 / (width_falloff * i + 1).

  2. The canvas is converted to a PIL image and rotated to n_angles orientations evenly distributed in [0°, 180°).

  3. Each rotated mask is anti-symmetrised by subtracting its 180° rotation (a flip along both axes). The resulting kernels detect signed intensity gradients perpendicular to the stripe.

With padding='same' the convolution output preserves the spatial size of the input image.

Parameters:
  • params (ElongatedMaskParams | None) – Kernel configuration. Uses default ElongatedMaskParams when None.

  • device (device | str | None) – Computation device. Defaults to CUDA when available, otherwise CPU.

Examples

Default configuration (10 angles, 41 x 41 kernel):

>>> from image_processing import ElongatedMaskKernel
>>> kernel = ElongatedMaskKernel()
>>> kernel.kernels.shape
torch.Size([10, 41, 41])

GPU-tuned configuration matching the notebook’s GPU example:

>>> from image_processing import ElongatedMaskKernel, ElongatedMaskParams
>>> params = ElongatedMaskParams(
...     n_angles=18,
...     kernel_half_size=30,
...     stripe_half_width=5,
...     stripe_half_length=30,
...     length_falloff=0.1,
...     width_falloff=1.0,
... )
>>> kernel = ElongatedMaskKernel(params, device="cpu")
>>> kernel.kernels.shape
torch.Size([18, 61, 61])

build()

Build the anti-symmetrised rotated stripe kernel stack.

Returns:

Shape (n_angles, 2 * kernel_half_size + 1, 2 * kernel_half_size + 1) on self.device.

Return type:

Tensor

Raises:

ValueError – If the stripe dimensions exceed kernel_half_size.

class image_processing.ElongatedMaskParams(n_angles=10, kernel_half_size=20, stripe_half_width=5, stripe_half_length=20, length_falloff=0.05, width_falloff=0.0)

Bases: BaseKernelParams

Parameters for the elongated directional stripe kernel.

The base mask is a horizontal stripe whose pixel values decay along its length and optionally across its width. The stripe is rotated to n_angles orientations and anti-symmetrised by subtracting its own 180° rotation, making each oriented kernel respond to signed intensity gradients perpendicular to the stripe direction.

Parameters:
  • stripe_half_width (int) – Number of pixel rows occupied by the stripe measured from the centre row (total stripe width = stripe_half_width rows).

  • stripe_half_length (int) – Half-length of the stripe along its axis. Column offsets span -stripe_half_length to stripe_half_length - 1.

  • length_falloff (float) – Decay coefficient along the stripe axis. The weight at column offset j is 1 / (length_falloff * |j| + 1). Set to 0.0 for a uniform-weight stripe.

  • width_falloff (float) – Decay coefficient across the stripe. The weight at row offset i is multiplied by 1 / (width_falloff * i + 1). Set to 0.0 (default) for uniform weight across all rows.

length_falloff: float = 0.05
stripe_half_length: int = 20
stripe_half_width: int = 5
width_falloff: float = 0.0

image_processing.max_abs(response)

Return the pixel-wise maximum absolute response over channels and orientations.

Parameters:

response (Tensor) – Shape (C, N, H, W).

Returns:

Shape (H, W).

Return type:

Tensor

Examples

>>> import torch
>>> from image_processing.combination import max_abs
>>> r = torch.randn(3, 10, 64, 64)
>>> max_abs(r).shape
torch.Size([64, 64])

image_processing.sum_of_abs(response)

Sum absolute responses over all colour channels and kernel orientations.

Parameters:

response (Tensor) – Shape (C, N, H, W).

Returns:

Shape (H, W).

Return type:

Tensor

Examples

>>> import torch
>>> from image_processing.combination import sum_of_abs
>>> r = torch.randn(3, 10, 64, 64)
>>> sum_of_abs(r).shape
torch.Size([64, 64])

image_processing.sum_of_powers(power)

Return a combination function that sums |response|^power.

Higher powers emphasise strong responses and suppress weak ones, which sharpens detected edges at the cost of sensitivity. power=1 is equivalent to sum_of_abs(); power=2 is equivalent to sum_of_squares().

Parameters:

power (float) – Exponent applied to the absolute response before summation.

Returns:

A callable with signature (C, N, H, W) → (H, W).

Return type:

CombineFn

Examples

>>> import torch
>>> from image_processing.combination import sum_of_powers
>>> fn = sum_of_powers(3.0)
>>> fn(torch.randn(3, 10, 64, 64)).shape
torch.Size([64, 64])

image_processing.sum_of_squares(response)

Sum squared responses over all colour channels and kernel orientations.

Parameters:

response (Tensor) – Shape (C, N, H, W).

Returns:

Shape (H, W).

Return type:

Tensor

Examples

>>> import torch
>>> from image_processing.combination import sum_of_squares
>>> r = torch.randn(3, 10, 64, 64)
>>> sum_of_squares(r).shape
torch.Size([64, 64])