image_processing package
Submodules
image_processing.combination module
Combination functions for aggregating per-orientation convolution responses.
A combination function receives the full (C, N, H, W) response tensor
produced by applying N oriented kernels to a C-channel image and
reduces it to a (H, W) edge-strength map.
All built-in functions expect the response to have at least two dimensions
(C, N, ...); additional spatial dimensions are preserved.
- type image_processing.combination.CombineFn = Callable[[Tensor], Tensor]
- image_processing.combination.max_abs(response)
Return the pixel-wise maximum absolute response over channels and orientations.
- Parameters:
response (Tensor) – Shape (C, N, H, W).
- Returns:
Shape (H, W).
- Return type:
Tensor
Examples
>>> import torch
>>> from image_processing.combination import max_abs
>>> r = torch.randn(3, 10, 64, 64)
>>> max_abs(r).shape
torch.Size([64, 64])
- image_processing.combination.sum_of_abs(response)
Sum absolute responses over all colour channels and kernel orientations.
- Parameters:
response (Tensor) – Shape (C, N, H, W).
- Returns:
Shape (H, W).
- Return type:
Tensor
Examples
>>> import torch
>>> from image_processing.combination import sum_of_abs
>>> r = torch.randn(3, 10, 64, 64)
>>> sum_of_abs(r).shape
torch.Size([64, 64])
- image_processing.combination.sum_of_powers(power)
Return a combination function that sums |response|^power.
Higher powers emphasise strong responses and suppress weak ones, which sharpens detected edges at the cost of sensitivity.
power=1 is equivalent to sum_of_abs(); power=2 is equivalent to sum_of_squares().
- Parameters:
power (float) – Exponent applied to the absolute response before summation.
- Returns:
A callable with signature (C, N, H, W) → (H, W).
- Return type:
Callable[[Tensor], Tensor]
Examples
>>> import torch
>>> from image_processing.combination import sum_of_powers
>>> fn = sum_of_powers(3.0)
>>> fn(torch.randn(3, 10, 64, 64)).shape
torch.Size([64, 64])
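The factory pattern described for sum_of_powers can be illustrated without PyTorch. The following is a minimal pure-Python sketch, not the package's implementation: it assumes the response is a nested list indexed [c][n][h][w], and the name sum_of_powers_sketch is hypothetical.

```python
from typing import Callable, List

Response = List[List[List[List[float]]]]  # nested lists indexed [c][n][h][w]
EdgeMap = List[List[float]]               # nested lists indexed [h][w]


def sum_of_powers_sketch(power: float) -> Callable[[Response], EdgeMap]:
    """Return a function that sums |r| ** power over channels and orientations."""

    def combine(response: Response) -> EdgeMap:
        height = len(response[0][0])
        width = len(response[0][0][0])
        return [
            [
                sum(
                    abs(orientation[h][w]) ** power
                    for channel in response
                    for orientation in channel
                )
                for w in range(width)
            ]
            for h in range(height)
        ]

    return combine


# One channel, two orientations, a 1 x 2 image.
response = [[[[1.0, -2.0]], [[-3.0, 0.5]]]]
sum_abs = sum_of_powers_sketch(1)(response)      # [[4.0, 2.5]]
sum_squares = sum_of_powers_sketch(2)(response)  # [[10.0, 4.25]]
```

With power=1 the sketch reduces to a sum of absolute values, and with power=2 to a sum of squares, mirroring the equivalences stated above.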
- image_processing.combination.sum_of_squares(response)
Sum squared responses over all colour channels and kernel orientations.
- Parameters:
response (Tensor) – Shape (C, N, H, W).
- Returns:
Shape (H, W).
- Return type:
Tensor
Examples
>>> import torch
>>> from image_processing.combination import sum_of_squares
>>> r = torch.randn(3, 10, 64, 64)
>>> sum_of_squares(r).shape
torch.Size([64, 64])
image_processing.detector module
Edge detector that applies a kernel stack to an image and combines responses.
- class image_processing.detector.EdgeDetector(kernel=None, combine_fn=None, normalize=True)
Bases: object
Apply a stack of convolution kernels to detect edges in an image.
The detector runs the following steps:
1. Moves the input image to the kernel’s device and casts it to float32 if needed.
2. Applies all N kernel orientations to each of the C colour channels via torch.nn.functional.conv2d() with padding='same', producing a (C, N, H, W) response tensor.
3. Passes the response tensor to combine_fn to obtain an (H, W) edge-strength map.
4. Optionally normalises the map to [0, 1].
The output spatial size always matches the input spatial size.
- Parameters:
kernel (BaseKernel | None) – Provider of the oriented kernel stack. Defaults to ElongatedMaskKernel with default parameters.
combine_fn (Callable[[Tensor], Tensor] | None) – Callable (C, N, H, W) → (H, W) that aggregates the response tensor. Defaults to sum_of_squares().
normalize (bool) – When True (default), divides the edge map by its maximum so the output lies in [0, 1]. Has no effect when the map is all zeros.
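The normalisation rule for normalize=True can be sketched in plain Python. This is an illustrative sketch only: it assumes a non-negative edge map stored as nested lists, and the helper name normalize_edge_map is hypothetical rather than part of the package.

```python
from typing import List


def normalize_edge_map(edge_map: List[List[float]]) -> List[List[float]]:
    """Scale a non-negative edge map so that its largest value becomes 1."""
    peak = max(max(row) for row in edge_map)
    if peak == 0:
        # An all-zero map has no maximum to divide by; return a copy unchanged.
        return [row[:] for row in edge_map]
    return [[value / peak for value in row] for row in edge_map]


scaled = normalize_edge_map([[0.0, 2.0], [4.0, 1.0]])  # [[0.0, 0.5], [1.0, 0.25]]
zeros = normalize_edge_map([[0.0, 0.0]])               # [[0.0, 0.0]]
```

The explicit zero check is what keeps an all-zero map from triggering a division by zero.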
Examples
Minimal usage with default parameters:
>>> import torch
>>> from image_processing import EdgeDetector
>>> detector = EdgeDetector()
>>> image = torch.rand(3, 256, 256)  # (C, H, W) float image
>>> edges = detector.detect(image)
>>> edges.shape
torch.Size([256, 256])
Custom kernel and combination function:
>>> from image_processing import (
...     EdgeDetector,
...     ElongatedMaskKernel,
...     ElongatedMaskParams,
... )
>>> from image_processing.combination import sum_of_powers
>>> params = ElongatedMaskParams(
...     n_angles=18,
...     kernel_half_size=30,
...     length_falloff=0.1,
...     width_falloff=1.0,
... )
>>> detector = EdgeDetector(
...     kernel=ElongatedMaskKernel(params, device="cpu"),
...     combine_fn=sum_of_powers(3.0),
... )
>>> detector.detect(torch.rand(3, 128, 128)).shape
torch.Size([128, 128])
- detect(image)
Detect edges in a single image.
- Parameters:
image (Tensor) – Input image of shape (C, H, W) or (H, W). Grayscale inputs (H, W) are expanded to (1, H, W) automatically. Non-float tensors are cast to float32.
- Returns:
Edge map of shape (H, W) on the kernel’s device.
- Return type:
Tensor
- Raises:
ValueError – If image is not 2-D or 3-D.
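The input-shape rules for detect() can be summarised as a small shape check. The sketch below works on shape tuples only and is an assumption about how such a check might look; the function name as_channel_first and the error message are hypothetical.

```python
from typing import Tuple


def as_channel_first(shape: Tuple[int, ...]) -> Tuple[int, int, int]:
    """Map an (H, W) or (C, H, W) shape to (C, H, W); reject anything else."""
    if len(shape) == 2:
        height, width = shape
        return (1, height, width)  # grayscale gains a singleton channel axis
    if len(shape) == 3:
        channels, height, width = shape
        return (channels, height, width)
    raise ValueError(f"expected a 2-D or 3-D image, got {len(shape)}-D")


gray = as_channel_first((64, 64))      # (1, 64, 64)
colour = as_channel_first((3, 64, 64))  # (3, 64, 64)
```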
Examples
>>> import torch
>>> from image_processing import EdgeDetector
>>> EdgeDetector().detect(torch.rand(3, 64, 64)).shape
torch.Size([64, 64])
- detect_batch(images)
Detect edges in a list of images.
Images may have different spatial sizes; each is processed independently via detect().
- Parameters:
images (list[Tensor]) – Each element is a (C, H, W) or (H, W) tensor.
- Returns:
One (H, W) edge map per input image.
- Return type:
list[Tensor]
Examples
>>> import torch
>>> from image_processing import EdgeDetector
>>> imgs = [torch.rand(3, 64, 64), torch.rand(3, 128, 96)]
>>> [m.shape for m in EdgeDetector().detect_batch(imgs)]
[torch.Size([64, 64]), torch.Size([128, 96])]
image_processing.kernels module
Convolution kernel classes for edge detection.
- class image_processing.kernels.BaseKernel(params, device=None)
Bases: ABC
Abstract base class for edge-detection convolution kernels.
Concrete subclasses implement build() to construct a (N, kH, kW) kernel stack on self.device. The result is cached after the first access and can be discarded with reset().
- Parameters:
params (BaseKernelParams) – Kernel configuration.
device (device | str | None) – Computation device. Defaults to CUDA when available, otherwise CPU.
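The build-once, cache, and reset behaviour described above follows a common lazy-property pattern. The sketch below shows that pattern in isolation; the classes LazyKernelSketch and CountingKernel are hypothetical stand-ins, and the real BaseKernel may differ in its attribute names and constructor.

```python
from abc import ABC, abstractmethod
from typing import List, Optional

KernelStack = List[List[List[float]]]  # stand-in for a (N, kH, kW) tensor


class LazyKernelSketch(ABC):
    """Build the kernel stack on first access and reuse it afterwards."""

    def __init__(self) -> None:
        self._cache: Optional[KernelStack] = None

    @abstractmethod
    def build(self) -> KernelStack:
        """Construct and return the kernel stack."""

    @property
    def kernels(self) -> KernelStack:
        if self._cache is None:
            self._cache = self.build()
        return self._cache

    def reset(self) -> None:
        """Discard the cached stack so the next access rebuilds it."""
        self._cache = None


class CountingKernel(LazyKernelSketch):
    """Toy subclass that records how often build() runs."""

    def __init__(self) -> None:
        super().__init__()
        self.build_calls = 0

    def build(self) -> KernelStack:
        self.build_calls += 1
        return [[[0.0]]]


kernel = CountingKernel()
kernel.kernels
kernel.kernels                       # served from the cache
calls_before_reset = kernel.build_calls
kernel.reset()
kernel.kernels                       # rebuilt after reset()
calls_after_reset = kernel.build_calls
```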
Examples
Minimal subclass skeleton:
class MyKernel(BaseKernel):
    def build(self) -> torch.Tensor:
        # return tensor of shape (n_kernels, kH, kW) on self.device
        ...
- abstractmethod build()
Build and return the kernel stack.
- Returns:
Shape (n_kernels, kH, kW) on self.device.
- Return type:
Tensor
- property kernels: Tensor
Return the lazily built and cached kernel tensor.
- Returns:
Shape (n_kernels, kH, kW).
- Return type:
torch.Tensor
- class image_processing.kernels.ElongatedMaskKernel(params=None, device=None)
Bases: BaseKernel
Rotated elongated-stripe convolution kernels for edge detection.
Implements the method from Contour-texture separation: part 2 (Antal, 2024). The construction proceeds as follows:
1. A horizontal stripe of decaying values is placed on a blank square canvas. Weights fall off along the stripe length as 1 / (length_falloff * |j| + 1) and optionally across the width as 1 / (width_falloff * i + 1).
2. The canvas is converted to a PIL image and rotated to n_angles orientations evenly distributed in [0°, 180°).
3. Each rotated mask is anti-symmetrised by subtracting its 180° rotation (mirror flip). The resulting kernels detect signed intensity gradients perpendicular to the stripe.
With padding='same' the convolution output preserves the spatial size of the input image.
- Parameters:
params (ElongatedMaskParams | None) – Kernel configuration. Uses default ElongatedMaskParams when None.
device (device | str | None) – Computation device. Defaults to CUDA when available, otherwise CPU.
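The anti-symmetrisation step can be illustrated on a tiny mask. This is a minimal pure-Python sketch under the assumption that the 180° rotation is applied as a reversal of both rows and columns; the helper names rot180 and antisymmetrise are hypothetical, and the package itself works on tensors rather than nested lists.

```python
from typing import List

Mask = List[List[float]]


def rot180(mask: Mask) -> Mask:
    """Rotate a 2-D mask by 180 degrees (reverse rows, then each row)."""
    return [row[::-1] for row in mask[::-1]]


def antisymmetrise(mask: Mask) -> Mask:
    """Subtract the 180-degree rotation so that K == -rot180(K)."""
    rotated = rot180(mask)
    return [
        [a - b for a, b in zip(row, rotated_row)]
        for row, rotated_row in zip(mask, rotated)
    ]


# A stripe on the top row of a 3 x 3 canvas.
stripe = [
    [1.0, 2.0, 1.0],
    [0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0],
]
kernel = antisymmetrise(stripe)
# kernel == [[1.0, 2.0, 1.0], [0.0, 0.0, 0.0], [-1.0, -2.0, -1.0]]
```

The resulting kernel sums to zero, so it gives no response on constant regions and a signed response across an intensity step, which is the behaviour described for the oriented kernels.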
Examples
Default configuration (10 angles, 41 x 41 kernel):
>>> from image_processing import ElongatedMaskKernel
>>> kernel = ElongatedMaskKernel()
>>> kernel.kernels.shape
torch.Size([10, 41, 41])
GPU-tuned configuration matching the notebook’s GPU example:
>>> from image_processing import ElongatedMaskKernel, ElongatedMaskParams
>>> params = ElongatedMaskParams(
...     n_angles=18,
...     kernel_half_size=30,
...     stripe_half_width=5,
...     stripe_half_length=30,
...     length_falloff=0.1,
...     width_falloff=1.0,
... )
>>> kernel = ElongatedMaskKernel(params, device="cpu")
>>> kernel.kernels.shape
torch.Size([18, 61, 61])
image_processing.params module
Parameter dataclasses for edge detection kernels.
- class image_processing.params.BaseKernelParams(n_angles=10, kernel_half_size=20)
Bases: object
Base parameters shared by all convolution kernel families.
- Parameters:
n_angles (int) – Number of kernel orientations, evenly spaced in [0°, 180°).
kernel_half_size (int) – Half-size of the square kernel canvas. The full kernel will be (2 * kernel_half_size + 1) x (2 * kernel_half_size + 1) pixels, for example 41 x 41 for the default kernel_half_size=20.
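The relationship between these two parameters and the kernel stack shape can be worked through with simple arithmetic. The sketch below is inferred from the documented example shapes (a half-size of 20 gives 41 x 41, and 30 gives 61 x 61); the helper names are hypothetical, and starting the angles at 0° with a uniform step of 180° / n_angles is an assumption.

```python
from typing import List, Tuple


def kernel_side(kernel_half_size: int) -> int:
    """Side length of the square canvas: a centre pixel plus both halves."""
    return 2 * kernel_half_size + 1


def orientation_angles(n_angles: int) -> List[float]:
    """Angles in degrees, evenly spaced in [0, 180)."""
    return [180.0 * k / n_angles for k in range(n_angles)]


def kernel_stack_shape(n_angles: int, kernel_half_size: int) -> Tuple[int, int, int]:
    """Expected (N, kH, kW) shape of the kernel stack."""
    side = kernel_side(kernel_half_size)
    return (n_angles, side, side)


default_shape = kernel_stack_shape(10, 20)  # (10, 41, 41)
tuned_shape = kernel_stack_shape(18, 30)    # (18, 61, 61)
angles = orientation_angles(10)             # 0.0, 18.0, ..., 162.0
```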
- kernel_half_size: int = 20
- n_angles: int = 10
- class image_processing.params.ElongatedMaskParams(n_angles=10, kernel_half_size=20, stripe_half_width=5, stripe_half_length=20, length_falloff=0.05, width_falloff=0.0)
Bases: BaseKernelParams
Parameters for the elongated directional stripe kernel.
The base mask is a horizontal stripe whose pixel values decay along its length and optionally across its width. The stripe is rotated to n_angles orientations and anti-symmetrised by subtracting its own 180° rotation, making each oriented kernel respond to signed intensity gradients perpendicular to the stripe direction.
- Parameters:
stripe_half_width (int) – Number of pixel rows occupied by the stripe measured from the centre row (total stripe width = stripe_half_width rows).
stripe_half_length (int) – Half-length of the stripe along its axis. Column offsets span -stripe_half_length to stripe_half_length - 1.
length_falloff (float) – Decay coefficient along the stripe axis. The weight at column offset j is 1 / (length_falloff * |j| + 1). Set to 0.0 for a uniform-weight stripe.
width_falloff (float) – Decay coefficient across the stripe. The weight at row offset i is multiplied by 1 / (width_falloff * i + 1). Set to 0.0 (default) for uniform weight across all rows.
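The two falloff formulas combine into a single weight per stripe pixel. The following pure-Python sketch evaluates them for the default parameters; it is not the package's code. It assumes the row offset i starts at 0 on the first stripe row, it leaves out placement on the canvas, and the function name stripe_weights is hypothetical.

```python
from typing import List


def stripe_weights(
    stripe_half_width: int = 5,
    stripe_half_length: int = 20,
    length_falloff: float = 0.05,
    width_falloff: float = 0.0,
) -> List[List[float]]:
    """Weights of the unrotated stripe, indexed [row offset i][column index]."""
    return [
        [
            1.0 / (length_falloff * abs(j) + 1.0) / (width_falloff * i + 1.0)
            # Column offsets run from -stripe_half_length to stripe_half_length - 1.
            for j in range(-stripe_half_length, stripe_half_length)
        ]
        for i in range(stripe_half_width)
    ]


weights = stripe_weights()
centre = weights[0][20]  # column offset j = 0, weight 1.0
edge = weights[0][0]     # column offset j = -20, weight 1 / (0.05 * 20 + 1)

tapered = stripe_weights(width_falloff=1.0)
second_row_centre = tapered[1][20]  # row offset i = 1 halves the weight
```

With width_falloff left at 0.0 every row carries the same weights, so only the decay along the stripe length shapes the kernel.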
- length_falloff: float = 0.05
- stripe_half_length: int = 20
- stripe_half_width: int = 5
- width_falloff: float = 0.0
Module contents
Image processing package for GPU-accelerated edge detection.
Implements the elongated-mask edge detection method from Contour-texture separation: part 2 (Antal, 2024) as an extendable, GPU-aware PyTorch package.
Quick start
>>> import torch
>>> from image_processing import EdgeDetector
>>> detector = EdgeDetector() # uses GPU if available
>>> image = torch.rand(3, 512, 512) # (C, H, W) float image in [0, 1]
>>> edges = detector.detect(image) # (H, W) edge map in [0, 1]
Customising the kernel
>>> from image_processing import (
... EdgeDetector,
... ElongatedMaskKernel,
... ElongatedMaskParams,
... )
>>> from image_processing.combination import sum_of_powers
>>> params = ElongatedMaskParams(
... n_angles=18,
... kernel_half_size=30,
... stripe_half_width=5,
... stripe_half_length=30,
... length_falloff=0.1,
... width_falloff=1.0,
... )
>>> detector = EdgeDetector(
... kernel=ElongatedMaskKernel(params),
... combine_fn=sum_of_powers(3.0),
... )
Extending with a custom kernel
Subclass BaseKernel and pair it with a custom BaseKernelParams dataclass:
from dataclasses import dataclass

import torch

from image_processing import BaseKernel, BaseKernelParams


@dataclass
class MyParams(BaseKernelParams):
    sigma: float = 1.0


class MyKernel(BaseKernel):
    def build(self) -> torch.Tensor:
        p: MyParams = self.params  # type: ignore[assignment]
        # ... build and return (N, kH, kW) tensor on self.device
Public API
- EdgeDetector - applies a kernel stack and combines the responses.
- ElongatedMaskKernel - rotated anti-symmetric stripe kernels.
- ElongatedMaskParams - parameters for ElongatedMaskKernel.
- BaseKernel - abstract base for custom kernel families.
- BaseKernelParams - base dataclass for kernel parameters.
- CombineFn - type alias for combination callables.
- image_processing.combination - built-in combination functions.
- class image_processing.BaseKernel(params, device=None)
Bases: ABC
Abstract base class for edge-detection convolution kernels.
Concrete subclasses implement build() to construct a (N, kH, kW) kernel stack on self.device. The result is cached after the first access and can be discarded with reset().
- Parameters:
params (BaseKernelParams) – Kernel configuration.
device (device | str | None) – Computation device. Defaults to CUDA when available, otherwise CPU.
Examples
Minimal subclass skeleton:
class MyKernel(BaseKernel):
    def build(self) -> torch.Tensor:
        # return tensor of shape (n_kernels, kH, kW) on self.device
        ...
- abstractmethod build()
Build and return the kernel stack.
- Returns:
Shape (n_kernels, kH, kW) on self.device.
- Return type:
Tensor
- property kernels: Tensor
Return the lazily built and cached kernel tensor.
- Returns:
Shape (n_kernels, kH, kW).
- Return type:
torch.Tensor
- class image_processing.BaseKernelParams(n_angles=10, kernel_half_size=20)
Bases: object
Base parameters shared by all convolution kernel families.
- Parameters:
n_angles (int) – Number of kernel orientations, evenly spaced in [0°, 180°).
kernel_half_size (int) – Half-size of the square kernel canvas. The full kernel will be (2 * kernel_half_size + 1) x (2 * kernel_half_size + 1) pixels, for example 41 x 41 for the default kernel_half_size=20.
- kernel_half_size: int = 20
- n_angles: int = 10
- type image_processing.CombineFn = Callable[[Tensor], Tensor]
- class image_processing.EdgeDetector(kernel=None, combine_fn=None, normalize=True)
Bases: object
Apply a stack of convolution kernels to detect edges in an image.
The detector runs the following steps:
1. Moves the input image to the kernel’s device and casts it to float32 if needed.
2. Applies all N kernel orientations to each of the C colour channels via torch.nn.functional.conv2d() with padding='same', producing a (C, N, H, W) response tensor.
3. Passes the response tensor to combine_fn to obtain an (H, W) edge-strength map.
4. Optionally normalises the map to [0, 1].
The output spatial size always matches the input spatial size.
- Parameters:
kernel (BaseKernel | None) – Provider of the oriented kernel stack. Defaults to ElongatedMaskKernel with default parameters.
combine_fn (CombineFn | None) – Callable (C, N, H, W) → (H, W) that aggregates the response tensor. Defaults to sum_of_squares().
normalize (bool) – When True (default), divides the edge map by its maximum so the output lies in [0, 1]. Has no effect when the map is all zeros.
Examples
Minimal usage with default parameters:
>>> import torch
>>> from image_processing import EdgeDetector
>>> detector = EdgeDetector()
>>> image = torch.rand(3, 256, 256)  # (C, H, W) float image
>>> edges = detector.detect(image)
>>> edges.shape
torch.Size([256, 256])
Custom kernel and combination function:
>>> from image_processing import (
...     EdgeDetector,
...     ElongatedMaskKernel,
...     ElongatedMaskParams,
... )
>>> from image_processing.combination import sum_of_powers
>>> params = ElongatedMaskParams(
...     n_angles=18,
...     kernel_half_size=30,
...     length_falloff=0.1,
...     width_falloff=1.0,
... )
>>> detector = EdgeDetector(
...     kernel=ElongatedMaskKernel(params, device="cpu"),
...     combine_fn=sum_of_powers(3.0),
... )
>>> detector.detect(torch.rand(3, 128, 128)).shape
torch.Size([128, 128])
- detect(image)
Detect edges in a single image.
- Parameters:
image (Tensor) – Input image of shape (C, H, W) or (H, W). Grayscale inputs (H, W) are expanded to (1, H, W) automatically. Non-float tensors are cast to float32.
- Returns:
Edge map of shape (H, W) on the kernel’s device.
- Return type:
Tensor
- Raises:
ValueError – If image is not 2-D or 3-D.
Examples
>>> import torch
>>> from image_processing import EdgeDetector
>>> EdgeDetector().detect(torch.rand(3, 64, 64)).shape
torch.Size([64, 64])
- detect_batch(images)
Detect edges in a list of images.
Images may have different spatial sizes; each is processed independently via detect().
- Parameters:
images (list[Tensor]) – Each element is a (C, H, W) or (H, W) tensor.
- Returns:
One (H, W) edge map per input image.
- Return type:
list[Tensor]
Examples
>>> import torch
>>> from image_processing import EdgeDetector
>>> imgs = [torch.rand(3, 64, 64), torch.rand(3, 128, 96)]
>>> [m.shape for m in EdgeDetector().detect_batch(imgs)]
[torch.Size([64, 64]), torch.Size([128, 96])]
- class image_processing.ElongatedMaskKernel(params=None, device=None)
Bases: BaseKernel
Rotated elongated-stripe convolution kernels for edge detection.
Implements the method from Contour-texture separation: part 2 (Antal, 2024). The construction proceeds as follows:
1. A horizontal stripe of decaying values is placed on a blank square canvas. Weights fall off along the stripe length as 1 / (length_falloff * |j| + 1) and optionally across the width as 1 / (width_falloff * i + 1).
2. The canvas is converted to a PIL image and rotated to n_angles orientations evenly distributed in [0°, 180°).
3. Each rotated mask is anti-symmetrised by subtracting its 180° rotation (mirror flip). The resulting kernels detect signed intensity gradients perpendicular to the stripe.
With padding='same' the convolution output preserves the spatial size of the input image.
- Parameters:
params (ElongatedMaskParams | None) – Kernel configuration. Uses default ElongatedMaskParams when None.
device (device | str | None) – Computation device. Defaults to CUDA when available, otherwise CPU.
Examples
Default configuration (10 angles, 41 x 41 kernel):
>>> from image_processing import ElongatedMaskKernel
>>> kernel = ElongatedMaskKernel()
>>> kernel.kernels.shape
torch.Size([10, 41, 41])
GPU-tuned configuration matching the notebook’s GPU example:
>>> from image_processing import ElongatedMaskKernel, ElongatedMaskParams
>>> params = ElongatedMaskParams(
...     n_angles=18,
...     kernel_half_size=30,
...     stripe_half_width=5,
...     stripe_half_length=30,
...     length_falloff=0.1,
...     width_falloff=1.0,
... )
>>> kernel = ElongatedMaskKernel(params, device="cpu")
>>> kernel.kernels.shape
torch.Size([18, 61, 61])
- class image_processing.ElongatedMaskParams(n_angles=10, kernel_half_size=20, stripe_half_width=5, stripe_half_length=20, length_falloff=0.05, width_falloff=0.0)
Bases: BaseKernelParams
Parameters for the elongated directional stripe kernel.
The base mask is a horizontal stripe whose pixel values decay along its length and optionally across its width. The stripe is rotated to n_angles orientations and anti-symmetrised by subtracting its own 180° rotation, making each oriented kernel respond to signed intensity gradients perpendicular to the stripe direction.
- Parameters:
stripe_half_width (int) – Number of pixel rows occupied by the stripe measured from the centre row (total stripe width = stripe_half_width rows).
stripe_half_length (int) – Half-length of the stripe along its axis. Column offsets span -stripe_half_length to stripe_half_length - 1.
length_falloff (float) – Decay coefficient along the stripe axis. The weight at column offset j is 1 / (length_falloff * |j| + 1). Set to 0.0 for a uniform-weight stripe.
width_falloff (float) – Decay coefficient across the stripe. The weight at row offset i is multiplied by 1 / (width_falloff * i + 1). Set to 0.0 (default) for uniform weight across all rows.
- length_falloff: float = 0.05
- stripe_half_length: int = 20
- stripe_half_width: int = 5
- width_falloff: float = 0.0
- image_processing.max_abs(response)
Return the pixel-wise maximum absolute response over channels and orientations.
- Parameters:
response (Tensor) – Shape (C, N, H, W).
- Returns:
Shape (H, W).
- Return type:
Tensor
Examples
>>> import torch
>>> from image_processing.combination import max_abs
>>> r = torch.randn(3, 10, 64, 64)
>>> max_abs(r).shape
torch.Size([64, 64])
- image_processing.sum_of_abs(response)
Sum absolute responses over all colour channels and kernel orientations.
- Parameters:
response (Tensor) – Shape (C, N, H, W).
- Returns:
Shape (H, W).
- Return type:
Tensor
Examples
>>> import torch
>>> from image_processing.combination import sum_of_abs
>>> r = torch.randn(3, 10, 64, 64)
>>> sum_of_abs(r).shape
torch.Size([64, 64])
- image_processing.sum_of_powers(power)
Return a combination function that sums |response|^power.
Higher powers emphasise strong responses and suppress weak ones, which sharpens detected edges at the cost of sensitivity.
power=1 is equivalent to sum_of_abs(); power=2 is equivalent to sum_of_squares().
- Parameters:
power (float) – Exponent applied to the absolute response before summation.
- Returns:
A callable with signature (C, N, H, W) → (H, W).
- Return type:
CombineFn
Examples
>>> import torch
>>> from image_processing.combination import sum_of_powers
>>> fn = sum_of_powers(3.0)
>>> fn(torch.randn(3, 10, 64, 64)).shape
torch.Size([64, 64])
- image_processing.sum_of_squares(response)
Sum squared responses over all colour channels and kernel orientations.
- Parameters:
response (Tensor) – Shape (C, N, H, W).
- Returns:
Shape (H, W).
- Return type:
Tensor
Examples
>>> import torch
>>> from image_processing.combination import sum_of_squares
>>> r = torch.randn(3, 10, 64, 64)
>>> sum_of_squares(r).shape
torch.Size([64, 64])