torchvision.ops
torchvision.ops implements operators that are specific to computer vision.
Note
These operators currently do not support TorchScript.
-
torchvision.ops.nms(boxes, scores, iou_threshold)
Performs non-maximum suppression (NMS) on the boxes according to their intersection-over-union (IoU).
NMS iteratively removes lower-scoring boxes that have an IoU greater than iou_threshold with another (higher-scoring) box.
- Parameters
- Returns
keep – int64 tensor with the indices of the elements that have been kept by NMS, sorted in decreasing order of scores
- Return type
-
torchvision.ops.roi_align(input, boxes, output_size, spatial_scale=1.0, sampling_ratio=-1, aligned=False)
Performs the Region of Interest (RoI) Align operator described in Mask R-CNN.
- Parameters
input (Tensor[N, C, H, W]) – input tensor
boxes (Tensor[K, 5] or List[Tensor[L, 4]]) – the box coordinates in (x1, y1, x2, y2) format where the regions will be taken from. If a single Tensor is passed, then the first column should contain the batch index. If a list of Tensors is passed, then each Tensor will correspond to the boxes for an element i in a batch
output_size (int or Tuple[int, int]) – the size of the output after the cropping is performed, as (height, width)
spatial_scale (float) – a scaling factor that maps the input coordinates to the box coordinates. Default: 1.0
sampling_ratio (int) – number of sampling points in the interpolation grid used to compute the output value of each pooled output bin. If > 0, then exactly sampling_ratio x sampling_ratio grid points are used. If <= 0, then an adaptive number of grid points is used (computed as ceil(roi_width / pooled_w), and likewise for height). Default: -1
aligned (bool) – If False, use the legacy implementation. If True, shift the box coordinates by -0.5 pixels so that they align more precisely with the two neighboring pixel indices. This is the version used in Detectron2
- Returns
output (Tensor[K, C, output_size[0], output_size[1]])
-
torchvision.ops.roi_pool(input, boxes, output_size, spatial_scale=1.0)
Performs the Region of Interest (RoI) Pool operator described in Fast R-CNN.
- Parameters
input (Tensor[N, C, H, W]) – input tensor
boxes (Tensor[K, 5] or List[Tensor[L, 4]]) – the box coordinates in (x1, y1, x2, y2) format where the regions will be taken from. If a single Tensor is passed, then the first column should contain the batch index. If a list of Tensors is passed, then each Tensor will correspond to the boxes for an element i in a batch
output_size (int or Tuple[int, int]) – the size of the output after the cropping is performed, as (height, width)
spatial_scale (float) – a scaling factor that maps the input coordinates to the box coordinates. Default: 1.0
- Returns
output (Tensor[K, C, output_size[0], output_size[1]])