Module `julius.core`

Signal processing and TensorFlow-related utilities.

Functions

def conv1d(input: tensorflow.python.framework.tensor.Tensor, weight: tensorflow.python.framework.tensor.Tensor, stride: int = 1, padding: int = 0) ‑> tensorflow.python.framework.tensor.Tensor

def conv1d(input: tf.Tensor, weight: tf.Tensor, stride: int = 1, padding: int = 0) -> tf.Tensor:
    """
    1D convolution (cross-correlation) following the `torch.nn.functional.conv1d` convention,
    implemented on top of `tf.nn.conv1d`.

    Args:
        input (Tensor): input signal of shape `[B, C, T]` (channels first).
        weight (Tensor): convolution weight of shape `[D, C, K]` with `D` the number
            of output channels.
        stride (int): stride of the convolution.
        padding (int): amount of zero padding applied to both sides of the input.
    """
    if padding:
        input = _pad_last(input, padding, padding, value=0.)
    x = tf.transpose(input, [0, 2, 1])  # [B, T, C], i.e. NWC as expected by tf.nn.conv1d
    w = tf.transpose(weight, [2, 1, 0])  # [K, C, D]
    y = tf.nn.conv1d(x, w, stride=stride, padding='VALID')
    return tf.transpose(y, [0, 2, 1])  # back to [B, D, T']

1D convolution (cross-correlation) following the torch.nn.functional.conv1d convention, implemented on top of tf.nn.conv1d.

Args

input : Tensor: input signal of shape [B, C, T] (channels first).
weight : Tensor: convolution weight of shape [D, C, K] with D the number of output channels.
stride : int: stride of the convolution.
padding : int: amount of zero padding applied to both sides of the input.
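
As an illustration of the convention described above, here is a minimal plain-Python sketch for a single channel. The helper name `conv1d_reference` is hypothetical and not part of `julius.core`. It assumes the usual output length `T' = floor((T + 2 * padding - K) / stride) + 1`, which follows from zero padding both sides and then applying a `'VALID'` convolution.

```python
def conv1d_reference(signal, kernel, stride=1, padding=0):
    """Single-channel cross-correlation on plain lists (illustration only)."""
    # Zero padding on both sides, as the `padding` argument describes.
    padded = [0.0] * padding + list(signal) + [0.0] * padding
    # 'VALID' output length: only positions where the kernel fully fits.
    n_out = (len(padded) - len(kernel)) // stride + 1
    # Cross-correlation: the kernel is NOT flipped.
    return [
        sum(padded[i * stride + k] * kernel[k] for k in range(len(kernel)))
        for i in range(n_out)
    ]


out = conv1d_reference([1.0, 2.0, 3.0, 4.0], [1.0, 0.0, -1.0], stride=1, padding=1)
print(out)  # [-2.0, -2.0, -2.0, 3.0]
```

With `padding=1` and a kernel of size 3, the output has the same length as the input.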

def hz_to_mel(freqs: tensorflow.python.framework.tensor.Tensor)

def hz_to_mel(freqs: tf.Tensor):
    """
    Converts a Tensor of frequencies in hertz to the mel scale.
    Uses the simple formula by O'Shaughnessy (1987).

    Args:
        freqs (tf.Tensor): frequencies to convert.

    """
    return 2595 * _log10(1 + freqs / 700)

Converts a Tensor of frequencies in hertz to the mel scale. Uses the simple formula by O'Shaughnessy (1987).

Args

freqs : tf.Tensor: frequencies to convert.
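
The formula can be checked on scalars without TensorFlow. The sketch below uses a hypothetical helper `hz_to_mel_scalar` that mirrors the expression in the source above. The constants are chosen so that 1000 Hz maps to roughly 1000 mel.

```python
import math


def hz_to_mel_scalar(freq_hz):
    """Scalar version of the formula above: 2595 * log10(1 + f / 700)."""
    return 2595 * math.log10(1 + freq_hz / 700)


mel_1k = hz_to_mel_scalar(1000.0)
print(round(mel_1k, 1))  # approximately 1000.0
```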

def mel_frequencies(n_mels: int, fmin: float, fmax: float)

def mel_frequencies(n_mels: int, fmin: float, fmax: float):
    """
    Return frequencies that are evenly spaced in mel scale.

    Args:
        n_mels (int): number of frequencies to return.
        fmin (float): start from this frequency (in Hz).
        fmax (float): finish at this frequency (in Hz).


    """
    low = float(hz_to_mel(tf.constant(float(fmin), tf.float32)))
    high = float(hz_to_mel(tf.constant(float(fmax), tf.float32)))
    mels = tf.linspace(low, high, n_mels)
    return mel_to_hz(mels)

Return frequencies that are evenly spaced in mel scale.

Args

n_mels : int: number of frequencies to return.
fmin : float: start from this frequency (in Hz).
fmax : float: finish at this frequency (in Hz).
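
A plain-Python sketch of the same computation may help to see what "evenly spaced in mel scale" means in hertz. The helper `mel_frequencies_reference` is hypothetical and only mirrors the steps in the source above (convert the bounds to mel, interpolate linearly, convert back).

```python
import math


def mel_frequencies_reference(n_mels, fmin, fmax):
    """Frequencies (Hz) evenly spaced on the mel scale, on plain floats."""
    low = 2595 * math.log10(1 + fmin / 700)
    high = 2595 * math.log10(1 + fmax / 700)
    if n_mels == 1:
        return [float(fmin)]
    step = (high - low) / (n_mels - 1)
    mels = [low + i * step for i in range(n_mels)]
    # Inverse formula: 700 * (10 ** (m / 2595) - 1)
    return [700 * (10 ** (m / 2595) - 1) for m in mels]


freqs = mel_frequencies_reference(5, 0.0, 8000.0)
gaps = [b - a for a, b in zip(freqs, freqs[1:])]
print(freqs)
print(gaps)  # the spacing in Hz grows with frequency
```

The spacing is constant in mel but widens in hertz, which is the point of the scale.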

def mel_to_hz(mels: tensorflow.python.framework.tensor.Tensor)

def mel_to_hz(mels: tf.Tensor):
    """
    Converts a Tensor of mel-scaled frequencies to hertz.
    Uses the simple formula by O'Shaughnessy (1987).

    Args:
        mels (tf.Tensor): mel frequencies to convert.
    """
    return 700 * (tf.pow(tf.cast(10.0, mels.dtype), mels / 2595) - 1)

Converts a Tensor of mel-scaled frequencies to hertz. Uses the simple formula by O'Shaughnessy (1987).

Args

mels : tf.Tensor: mel frequencies to convert.
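
This conversion is the algebraic inverse of `hz_to_mel`. The sketch below checks the round trip on scalars. Both helper names are hypothetical and simply restate the two formulas from the source.

```python
import math


def hz_to_mel_scalar(freq_hz):
    return 2595 * math.log10(1 + freq_hz / 700)


def mel_to_hz_scalar(mel):
    return 700 * (10 ** (mel / 2595) - 1)


inputs = (0.0, 440.0, 8000.0)
round_trip = [mel_to_hz_scalar(hz_to_mel_scalar(f)) for f in inputs]
print(round_trip)  # matches the inputs up to floating-point error
```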

def pad_replicate(x: tensorflow.python.framework.tensor.Tensor, left: int, right: int) ‑> tensorflow.python.framework.tensor.Tensor

def pad_replicate(x: tf.Tensor, left: int, right: int) -> tf.Tensor:
    """
    Pad the last dimension of `x` by replicating its edge values (equivalent to
    PyTorch's ``mode='replicate'``). This avoids discontinuities at the borders that
    would otherwise create strong artifacts when filtering.
    """
    left_pad = tf.repeat(x[..., :1], left, axis=-1)
    right_pad = tf.repeat(x[..., -1:], right, axis=-1)
    return tf.concat([left_pad, x, right_pad], axis=-1)

Pad the last dimension of x by replicating its edge values (equivalent to PyTorch's mode='replicate'). This avoids discontinuities at the borders that would otherwise create strong artifacts when filtering.
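
To make the replicate behavior concrete, here is a small sketch on a plain list. The helper `pad_replicate_list` is hypothetical and handles a single 1D sequence rather than a tensor.

```python
def pad_replicate_list(values, left, right):
    """Repeat the first value `left` times and the last value `right` times."""
    return [values[0]] * left + list(values) + [values[-1]] * right


padded = pad_replicate_list([3, 5, 7], left=2, right=1)
print(padded)  # [3, 3, 3, 5, 7, 7]
```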

def pad_to(tensor: tensorflow.python.framework.tensor.Tensor, target_length: int, mode: str = 'constant', value: float = 0.0)

def pad_to(tensor: tf.Tensor, target_length: int, mode: str = 'constant', value: float = 0.):
    """
    Pad the last dimension of the given tensor on the right up to `target_length`,
    filling with `value` (0 by default). `mode` is accepted but not used here.
    """
    return _pad_last(tensor, 0, target_length - tf.shape(tensor)[-1], value=value)

Pad the last dimension of the given tensor on the right up to target_length, filling with value (0 by default). mode is accepted but not used here.
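
A minimal sketch of the intended effect on a plain list follows. The helper `pad_to_list` is hypothetical. It assumes that an input already at least `target_length` long is returned unchanged, since `_pad_last` is not shown in this page and its behavior for a negative pad amount is unknown.

```python
def pad_to_list(values, target_length, value=0.0):
    """Right-pad a list with `value` up to `target_length` (illustration only)."""
    # Assumption: never truncate when the input is already long enough.
    missing = max(target_length - len(values), 0)
    return list(values) + [value] * missing


padded = pad_to_list([1.0, 2.0, 3.0], 5)
print(padded)  # [1.0, 2.0, 3.0, 0.0, 0.0]
```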

def pure_tone(freq: float, sr: float = 128, dur: float = 4)

def pure_tone(freq: float, sr: float = 128, dur: float = 4):
    """
    Return a pure tone, i.e. a cosine wave.

    Args:
        freq (float): frequency (in Hz)
        sr (float): sample rate (in Hz)
        dur (float): duration (in seconds)
    """
    time = tf.range(int(sr * dur), dtype=tf.float32) / sr
    return tf.cos(2 * math.pi * freq * time)

Return a pure tone, i.e. a cosine wave.

Args

freq : float: frequency (in Hz)
sr : float: sample rate (in Hz)
dur : float: duration (in seconds)
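
The sketch below reproduces the same sampling on plain floats, using a hypothetical helper `pure_tone_list`. With the default `sr=128` and `dur=4` the result has 512 samples, and a 32 Hz tone completes one period every 4 samples.

```python
import math


def pure_tone_list(freq, sr=128, dur=4):
    """Cosine of frequency `freq` sampled at `sr` Hz for `dur` seconds."""
    n_samples = int(sr * dur)
    return [math.cos(2 * math.pi * freq * (i / sr)) for i in range(n_samples)]


tone = pure_tone_list(freq=32)
print(len(tone))  # 512
print([round(v, 6) for v in tone[:4]])  # one period: 1, 0, -1, 0 (up to rounding)
```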

def sinc(x: tensorflow.python.framework.tensor.Tensor) ‑> tensorflow.python.framework.tensor.Tensor

def sinc(x: tf.Tensor) -> tf.Tensor:
    """
    Implementation of the unnormalized sinc, i.e. sin(x) / x, with the value at x = 0 set to 1.

    __Warning__: the input is not multiplied by `pi`!
    """
    return tf.where(tf.equal(x, tf.cast(0.0, x.dtype)), tf.ones_like(x), tf.sin(x) / x)

Implementation of the unnormalized sinc, i.e. sin(x) / x, with the value at x = 0 set to 1.

Warning: the input is not multiplied by pi!
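
The warning matters when comparing against the normalized convention `sin(pi * t) / (pi * t)`. The scalar sketch below shows how to obtain it by scaling the argument. Both helper names are hypothetical.

```python
import math


def sinc_unnormalized(x):
    """sin(x) / x with the removable singularity at 0 filled in with 1."""
    return 1.0 if x == 0 else math.sin(x) / x


def sinc_normalized(t):
    """Normalized sinc, obtained by multiplying the argument by pi first."""
    return sinc_unnormalized(math.pi * t)


print(sinc_unnormalized(0.0))  # 1.0
print(sinc_unnormalized(1.0))  # sin(1), about 0.841471
print(sinc_normalized(1.0))    # about 0: zeros fall on nonzero integers
```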

def unfold(input: tensorflow.python.framework.tensor.Tensor, kernel_size: int, stride: int)

def unfold(input: tf.Tensor, kernel_size: int, stride: int):
    """1D only unfolding similar to the one from PyTorch.

    Given an input tensor of size `[*, T]` this will return
    a tensor `[*, F, K]` with `K` the kernel size, and `F` the number
    of frames. The i-th frame covers `input[..., i * stride: i * stride + kernel_size]`.
    The input is zero-padded on the right so that every entry of `input` is covered by at least one frame.

    Args:
        input (Tensor): tensor for which to return the frames.
        kernel_size (int): size of each frame.
        stride (int): stride between each frame.

    Shape:

        - Inputs: `input` is `[*, T]`
        - Output: `[*, F, kernel_size]` with `F = 1 + ceil(max(T - kernel_size, 0) / stride)`


    .. warning:: unlike PyTorch unfold, this will pad the input
        so that any position in `input` is covered by at least one frame.

    Implemented on top of `tf.signal.frame`, the natural TensorFlow primitive for this.
    """
    length = tf.shape(input)[-1]
    covered = tf.maximum(length, kernel_size) - kernel_size
    n_frames = (covered + stride - 1) // stride + 1
    tgt_length = (n_frames - 1) * stride + kernel_size
    padded = _pad_last(input, 0, tgt_length - length, value=0.)
    return tf.signal.frame(padded, kernel_size, stride, axis=-1)

1D only unfolding similar to the one from PyTorch.

Given an input tensor of size [*, T] this will return a tensor [*, F, K] with K the kernel size, and F the number of frames. The i-th frame covers input[..., i * stride: i * stride + kernel_size]. The input is zero-padded on the right so that every entry of input is covered by at least one frame.

Args

input : Tensor: tensor for which to return the frames.
kernel_size : int: size of each frame.
stride : int: stride between each frame.

Shape

Inputs: input is [*, T]
Output: [*, F, kernel_size] with F = 1 + ceil(max(T - kernel_size, 0) / stride)

Warning: unlike PyTorch unfold, this will pad the input so that any position in input is covered by at least one frame.

Implemented on top of tf.signal.frame, the natural TensorFlow primitive for this.
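
The frame count and padding arithmetic can be followed on a plain list. The sketch below uses a hypothetical helper `unfold_list` that repeats the integer arithmetic from the source above and slices frames by hand instead of calling `tf.signal.frame`.

```python
def unfold_list(values, kernel_size, stride):
    """Split a list into overlapping frames, zero-padding the tail if needed."""
    length = len(values)
    covered = max(length, kernel_size) - kernel_size
    n_frames = (covered + stride - 1) // stride + 1  # 1 + ceil(covered / stride)
    tgt_length = (n_frames - 1) * stride + kernel_size
    padded = list(values) + [0] * (tgt_length - length)
    return [padded[i * stride: i * stride + kernel_size] for i in range(n_frames)]


frames = unfold_list([1, 2, 3, 4, 5, 6, 7, 8], kernel_size=3, stride=2)
print(frames)  # [[1, 2, 3], [3, 4, 5], [5, 6, 7], [7, 8, 0]]
```

Here `T = 8` gives `F = 1 + ceil(5 / 2) = 4` frames, and one zero is appended so that the last entry is covered.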

def volume(x: tensorflow.python.framework.tensor.Tensor, floor=1e-08)

def volume(x: tf.Tensor, floor=1e-8):
    """
    Return the volume in dBFS, computed as `10 * log10(floor + mean(x**2))` over the last dimension.
    """
    return _log10(floor + tf.reduce_mean(x**2, axis=-1)) * 10

Return the volume in dBFS, computed as 10 * log10(floor + mean(x**2)) over the last dimension.
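
As a sanity check of the formula in the source above, the sketch below evaluates it on plain floats with a hypothetical helper `volume_dbfs`. A full-scale cosine has a mean square of 0.5, so it should land near -3.01 dBFS, while digital silence is clamped by `floor` to -80 dBFS.

```python
import math


def volume_dbfs(samples, floor=1e-8):
    """10 * log10(floor + mean square) on a plain list of samples."""
    mean_square = sum(s * s for s in samples) / len(samples)
    return 10 * math.log10(floor + mean_square)


# 8 Hz cosine at 128 Hz sample rate for 4 seconds (a whole number of periods).
tone = [math.cos(2 * math.pi * 8 * i / 128) for i in range(512)]
level = volume_dbfs(tone)
silence_level = volume_dbfs([0.0] * 512)
print(round(level, 2))          # about -3.01
print(round(silence_level, 2))  # about -80.0
```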