espnet2.diar package
espnet2.diar.__init__
espnet2.diar.abs_diar
class espnet2.diar.abs_diar.AbsDiarization
Bases: torch.nn.modules.module.Module, abc.ABC
Initializes internal Module state, shared by both nn.Module and ScriptModule.
abstract forward(input: torch.Tensor, ilens: torch.Tensor) → Tuple[torch.Tensor, torch.Tensor, collections.OrderedDict]
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note: Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
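The difference described in the note can be seen in a small sketch. The Doubler module and the recording hook are hypothetical and not part of espnet2; only register_forward_hook is standard torch.nn.Module API. Calling the instance triggers the hook, while calling forward directly skips it.

```python
import torch


class Doubler(torch.nn.Module):
    """Hypothetical toy module, used only to illustrate hook behaviour."""

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return 2 * x


calls = []  # one entry is appended every time the forward hook fires
module = Doubler()
# The hook returns None (list.append returns None), so the output is unchanged.
handle = module.register_forward_hook(
    lambda mod, inputs, output: calls.append(tuple(output.shape))
)

x = torch.ones(2, 3)
y_call = module(x)             # goes through __call__, so the hook runs
n_after_call = len(calls)
y_direct = module.forward(x)   # bypasses __call__, so the hook is skipped
n_after_direct = len(calls)
handle.remove()
```

Both calls return the same tensor; only the side effects of the registered hook differ.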
espnet2.diar.espnet_model
class espnet2.diar.espnet_model.ESPnetDiarizationModel(frontend: Optional[espnet2.asr.frontend.abs_frontend.AbsFrontend], specaug: Optional[espnet2.asr.specaug.abs_specaug.AbsSpecAug], normalize: Optional[espnet2.layers.abs_normalize.AbsNormalize], label_aggregator: torch.nn.modules.module.Module, encoder: espnet2.asr.encoder.abs_encoder.AbsEncoder, decoder: espnet2.diar.decoder.abs_decoder.AbsDecoder, attractor: Optional[espnet2.diar.attractor.abs_attractor.AbsAttractor], attractor_weight: float = 1.0)
Bases: espnet2.train.abs_espnet_model.AbsESPnetModel
Speaker diarization model.
If attractor is None, SA-EEND is used; otherwise, EEND-EDA is used. For details on SA-EEND and EEND-EDA, refer to the following papers:
SA-EEND: https://arxiv.org/pdf/1909.06247.pdf
EEND-EDA: https://arxiv.org/pdf/2005.09921.pdf, https://arxiv.org/pdf/2106.10654.pdf
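The switch between the two modes can be sketched as follows. This is an illustration under stated assumptions, not the espnet2 implementation: diarization_logits is a hypothetical helper, the SA-EEND branch is assumed to apply a linear layer to each frame, and the EEND-EDA branch is assumed to score each frame against each attractor with a dot product, as described in the EEND-EDA papers.

```python
from typing import Optional

import torch


def diarization_logits(
    enc_out: torch.Tensor,                       # [Batch, T, F]
    linear: torch.nn.Linear,                     # F -> num_spk
    attractors: Optional[torch.Tensor] = None,   # [Batch, num_spk, F] or None
) -> torch.Tensor:
    """Hypothetical helper: per-frame, per-speaker activity logits."""
    if attractors is None:
        # SA-EEND style: a fixed number of speakers, one output unit each.
        return linear(enc_out)
    # EEND-EDA style: dot product between frame embeddings and attractors.
    return torch.bmm(enc_out, attractors.transpose(1, 2))


enc_out = torch.randn(2, 50, 16)
linear = torch.nn.Linear(16, 3)
attractors = torch.randn(2, 3, 16)

sa_logits = diarization_logits(enc_out, linear)               # no attractor
eda_logits = diarization_logits(enc_out, linear, attractors)  # with attractors
```

In both branches the result has shape [Batch, T, num_spk]; with attractors, the speaker dimension follows the number of attractors rather than a fixed layer width.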
collect_feats(speech: torch.Tensor, speech_lengths: torch.Tensor, spk_labels: torch.Tensor = None, spk_labels_lengths: torch.Tensor = None) → Dict[str, torch.Tensor]
encode(speech: torch.Tensor, speech_lengths: torch.Tensor) → Tuple[torch.Tensor, torch.Tensor]
Frontend + Encoder
- Parameters
speech – (Batch, Length, …)
speech_lengths – (Batch,)
forward(speech: torch.Tensor, speech_lengths: torch.Tensor = None, spk_labels: torch.Tensor = None, spk_labels_lengths: torch.Tensor = None) → Tuple[torch.Tensor, Dict[str, torch.Tensor], torch.Tensor]
Frontend + Encoder + Decoder + Calc loss
- Parameters
speech – (Batch, samples)
speech_lengths – (Batch,). Defaults to None for the chunk iterator, because the chunk iterator does not return speech_lengths; see espnet2/iterators/chunk_iter_factory.py
spk_labels – (Batch, )
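One plausible way to handle a missing speech_lengths is to treat every utterance in the batch as full length. The sketch below is an assumption made for illustration: resolve_lengths is a hypothetical helper, and the source does not state how the model itself fills in the default.

```python
from typing import Optional

import torch


def resolve_lengths(
    speech: torch.Tensor,                           # (Batch, samples)
    speech_lengths: Optional[torch.Tensor] = None,  # (Batch,) or None
) -> torch.Tensor:
    """Hypothetical helper: fall back to full-length utterances."""
    if speech_lengths is not None:
        return speech_lengths
    batch, samples = speech.shape[0], speech.shape[1]
    # Fixed-size chunks carry no padding, so each length equals `samples`.
    return torch.full((batch,), samples, dtype=torch.long)


speech = torch.zeros(3, 160)
default_lengths = resolve_lengths(speech)
given_lengths = resolve_lengths(speech, torch.tensor([160, 120, 80]))
```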
espnet2.diar.label_processor
espnet2.diar.attractor.__init__
espnet2.diar.attractor.abs_attractor
class espnet2.diar.attractor.abs_attractor.AbsAttractor
Bases: torch.nn.modules.module.Module, abc.ABC
Initializes internal Module state, shared by both nn.Module and ScriptModule.
abstract forward(enc_input: torch.Tensor, ilens: torch.Tensor, dec_input: torch.Tensor) → Tuple[torch.Tensor, torch.Tensor]
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note: Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
espnet2.diar.attractor.rnn_attractor
class espnet2.diar.attractor.rnn_attractor.RnnAttractor(encoder_output_size: int, layer: int = 1, unit: int = 512, dropout: float = 0.1, attractor_grad: bool = True)
Bases: espnet2.diar.attractor.abs_attractor.AbsAttractor
Encoder-decoder attractor for speaker diarization
forward(enc_input: torch.Tensor, ilens: torch.Tensor, dec_input: torch.Tensor)
Forward.
- Parameters
enc_input (torch.Tensor) – hidden_space [Batch, T, F]
ilens (torch.Tensor) – input lengths [Batch]
dec_input (torch.Tensor) – decoder input (zeros) [Batch, num_spk + 1, F]
- Returns
attractor: [Batch, num_spk + 1, F]
att_prob: [Batch, num_spk + 1, 1]
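The input and output shapes listed above can be reproduced with a minimal encoder-decoder sketch. TinyAttractor is a hypothetical stand-in and not the RnnAttractor implementation: it assumes one LSTM summarizes the frame sequence, a second LSTM seeded with that summary unrolls over the all-zero dec_input, and a linear layer produces one existence score per attractor. The hidden size is set equal to F so the attractors come out as [Batch, num_spk + 1, F].

```python
import torch


class TinyAttractor(torch.nn.Module):
    """Hypothetical encoder-decoder attractor, for shape illustration only."""

    def __init__(self, feat_dim: int):
        super().__init__()
        self.encoder = torch.nn.LSTM(feat_dim, feat_dim, batch_first=True)
        self.decoder = torch.nn.LSTM(feat_dim, feat_dim, batch_first=True)
        self.exist = torch.nn.Linear(feat_dim, 1)

    def forward(self, enc_input, ilens, dec_input):
        # Pack so that padded frames do not influence the final encoder state.
        packed = torch.nn.utils.rnn.pack_padded_sequence(
            enc_input, ilens.cpu(), batch_first=True, enforce_sorted=False
        )
        _, state = self.encoder(packed)
        # Each decoder step emits one attractor: [Batch, num_spk + 1, F].
        attractor, _ = self.decoder(dec_input, state)
        # One raw existence score per attractor: [Batch, num_spk + 1, 1].
        att_prob = self.exist(attractor)
        return attractor, att_prob


batch, frames, feat_dim, num_spk = 2, 40, 16, 3
enc_input = torch.randn(batch, frames, feat_dim)
ilens = torch.tensor([40, 25])
dec_input = torch.zeros(batch, num_spk + 1, feat_dim)

attractor, att_prob = TinyAttractor(feat_dim)(enc_input, ilens, dec_input)
```

The extra (num_spk + 1)-th slot follows the shapes documented above; in this sketch att_prob is an unnormalized score, and whether the real class returns logits or probabilities is not stated in the source.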
espnet2.diar.decoder.__init__
espnet2.diar.decoder.abs_decoder
class espnet2.diar.decoder.abs_decoder.AbsDecoder
Bases: torch.nn.modules.module.Module, abc.ABC
Initializes internal Module state, shared by both nn.Module and ScriptModule.
abstract forward(input: torch.Tensor, ilens: torch.Tensor) → Tuple[torch.Tensor, torch.Tensor]
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note: Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
abstract property num_spk
espnet2.diar.decoder.linear_decoder
class espnet2.diar.decoder.linear_decoder.LinearDecoder(encoder_output_size: int, num_spk: int = 2)
Bases: espnet2.diar.decoder.abs_decoder.AbsDecoder
Linear decoder for speaker diarization
forward(input: torch.Tensor, ilens: torch.Tensor)
Forward.
- Parameters
input (torch.Tensor) – hidden_space [Batch, T, F]
ilens (torch.Tensor) – input lengths [Batch]
property num_spk
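The relationship between an abstract decoder with an abstract num_spk property and a concrete linear subclass can be sketched as below. SketchAbsDecoder and SketchLinearDecoder are hypothetical names, not the espnet2 classes, and the output shape [Batch, T, num_spk] is an assumption, since the reference above does not document the return value.

```python
from abc import ABC, abstractmethod

import torch


class SketchAbsDecoder(torch.nn.Module, ABC):
    """Hypothetical abstract base mirroring the documented interface."""

    @abstractmethod
    def forward(self, input: torch.Tensor, ilens: torch.Tensor):
        raise NotImplementedError

    @property
    @abstractmethod
    def num_spk(self) -> int:
        raise NotImplementedError


class SketchLinearDecoder(SketchAbsDecoder):
    """Hypothetical linear decoder: one output unit per speaker."""

    def __init__(self, encoder_output_size: int, num_spk: int = 2):
        super().__init__()
        self._num_spk = num_spk
        self.linear = torch.nn.Linear(encoder_output_size, num_spk)

    def forward(self, input: torch.Tensor, ilens: torch.Tensor):
        # ilens is accepted to match the interface but unused in this sketch.
        return self.linear(input)  # assumed shape: [Batch, T, num_spk]

    @property
    def num_spk(self) -> int:
        return self._num_spk


decoder = SketchLinearDecoder(encoder_output_size=16, num_spk=3)
out = decoder(torch.zeros(2, 50, 16), torch.tensor([50, 40]))
```

Because forward and num_spk are abstract, the base class cannot be instantiated directly; only subclasses that override both can be constructed.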