Mangio-RVC-Fork/train/__pycache__/data_utils.cpython-39.pyc

121 lines
13 KiB

2023-03-31 17:47:00 +08:00
a
Q"d6G<00>@s<>ddlZddlZddlZddlZddlZddlmZddlm Z m
Z
Gdd<05>dejj j <0C>Z Gdd<07>d<07>ZGdd <09>d ejj j <0C>ZGd
d <0B>d <0B>ZGd d <0A>d ejj jj<12>ZdS)<0E>N)<01>spectrogram_torch)<02>load_wav_to_torch<63>load_filepaths_and_textc@sPeZdZdZdd<03>Zdd<05>Zdd<07>Zdd <09>Zd
d <0B>Zd d <0A>Z dd<0F>Z
dd<11>Z dS)<13>TextAudioLoaderMultiNSFsid<69><64>
1) loads audio, text pairs
2) normalizes text and converts it to sequences of integers
3) computes spectrograms from audio files.
cCsbt|<01>|_|j|_|j|_|j|_|j|_|j|_|j|_t|dd<02>|_t|dd<04>|_ |<00>
<EFBFBD>dS<00>N<> min_text_len<65><00> max_text_leni<6E><00> r<00>audiopaths_and_text<78> max_wav_value<75> sampling_rate<74> filter_length<74>
hop_length<EFBFBD>
win_length<EFBFBD>getattrrr
<00>_filter<65><03>selfr <00>hparams<6D>r<00>3/data/docker/liujing04/vc-webui/train/data_utils.py<70>__init__s
z#TextAudioLoaderMultiNSFsid.__init__cCszg}g}|jD]Z\}}}}}|jt|<04>krt|<04>|jkr|<01>|||||g<05>|<02>tj<06>|<03>d|j<00>q||_||_ dS<00>z2
Filter text & store spec lengths
<20>N<>
r r<00>lenr
<00>append<6E>os<6F>path<74>getsizer<00>lengths)r<00>audiopaths_and_text_newr"<00> audiopath<74>text<78>pitch<63>pitchf<68>dvrrrrsz"TextAudioLoaderMultiNSFsid._filtercCst<00>t|<01>g<01>}|S<00>N<><03>torch<63>
LongTensor<EFBFBD>int<6E>r<00>sidrrr<00>get_sid+sz"TextAudioLoaderMultiNSFsid.get_sidc Cs<>|d}|d}|d}|d}|d}|<00>|||<05>\}}}|<00>|<02>\}}|<00>|<06>}|<03><03>d} |<07><03>d}
| |
kr<>t| |
<EFBFBD>} | |j} |dd<00>d| <0B>f}|dd<00>d| <0C>f}|d| <0B>dd<00>f}|d| <0B>}|d| <0B>}||||||fS)Nrr r<00><00><00><><EFBFBD><EFBFBD><EFBFBD><EFBFBD><06>
get_labels<EFBFBD> get_audior0<00>size<7A>minr) r<00>audiopath_and_text<78>file<6C>phoner&r'r(<00>spec<65>wav<61> len_phone<6E>len_spec<65>len_min<69>len_wavrrr<00>get_audio_text_pair/s&
  

  z.TextAudioLoaderMultiNSFsid.get_audio_text_paircCs<>t<00>|<01>}tj|ddd<03>}t<00>|<02>}t<00>|<03>}t|jdd<04>}|d|<04>dd<00>f}|d|<04>}|d|<04>}t<05>|<01>}t<05>|<02>}t<05>|<03>}|||fS<00>Nrr)<01>axisi<73>)<08>np<6E>load<61>repeatr8<00>shaper+<00> FloatTensorr,)rr;r&r'<00>n_numrrrr5Ls


  


z%TextAudioLoaderMultiNSFsid.get_labelsc Cs<>t|<01>\}}||jkr(td<01>||j<01><02><01>||j}|<04>d<02>}|<01>dd<04>}tj<08> |<05>r<>zt
<EFBFBD> |<05>}Wq<>t |t <0A><0E><00>t||j|j|j|jdd<06>}t
<EFBFBD>|d<02>}t
j||dd<07>Yq<>0n8t||j|j|j|jdd<06>}t
<EFBFBD>|d<02>}t
j||dd<07>||fS<00>Nz {} SR doesn't match target {} SRrz.wavz.spec.ptF)<01>center)<01>_use_new_zipfile_serialization<6F>rr<00>
ValueError<EFBFBD>formatr <00> unsqueeze<7A>replacerr <00>existsr+rF<00>print<6E> traceback<63>
format_excrrrr<00>squeeze<7A>save<76>r<00>filename<6D>audiorZ
audio_normZ spec_filenamer<rrrr6[s@ 
<06><02>

   <02> <02> z$TextAudioLoaderMultiNSFsid.get_audiocCs|<00>|j|<00>Sr)<00>rBr <00>r<00>indexrrr<00> __getitem__}sz&TextAudioLoaderMultiNSFsid.__getitem__cCs
t|j<01>Sr)<00>rr <00>rrrr<00>__len__<5F>sz"TextAudioLoaderMultiNSFsid.__len__N<5F> <0C>__name__<5F>
__module__<EFBFBD> __qualname__<5F>__doc__rrr0rBr5r6r_rbrrrrr s "rc@s"eZdZdZddd<04>Zdd<06>ZdS) <09>TextAudioCollateMultiNSFsid<69>"Zero-pads model inputs and targetsFcCs
||_dSr)<00><01>
return_ids<EFBFBD>rrkrrrr<00>sz$TextAudioCollateMultiNSFsid.__init__c Cs@tjt<00>dd<02>|D<00><01>ddd<05>\}}tdd<02>|D<00><01>}tdd<02>|D<00><01>}t<00>t|<01><01>}t<00>t|<01><01>}t<00>t|<01>|dd<00>d<03>|<04>}t<00>t|<01>d|<05>} |<08><07>| <09><07>td d<02>|D<00><01>}
t<00>t|<01><01>} t<00>t|<01>|
|dd
jd<00>} t<00>t|<01>|
<EFBFBD>} t<00>t|<01>|
<EFBFBD>}| <0C><07>| <0A><07>|<0E><07>t<00>t|<01><01>}t t|<03><01>D]<5D>}|||}|d}|||d d <0B>d |<12>d<08><01>f<|<12>d<08>||<|d}|| |d d <0B>d |<13>d<08><01>f<|<13>d<08>||<|d
}|| |d |<14>d<03><01>d d <0B>f<|<14>d<03>| |<|d }|| |d |<15>d<03><01>f<|d }|||d |<16>d<03><01>f<|d||<<00>q8| | | |||| ||f S)<0F><>Collate's training batch from normalized text and aduio
PARAMS
------
batch: [text_normalized, spec_normalized, wav_normalized]
cSsg|]}|d<00>d<01><01>qS<00>rr <00>r7<00><02>.0<EFBFBD>xrrr<00>
<listcomp><3E><00>z8TextAudioCollateMultiNSFsid.__call__.<locals>.<listcomp>rT<><02>dim<69>
descendingcSsg|]}|d<00>d<01><01>qSrnrorprrrrs<00>rtcSsg|]}|d<00>d<00><01>qS<00>r rorprrrrs<00>rtr cSsg|]}|d<00>d<01><01>qS<00>rrrorprrrrs<00>rtrNr1r2<00><00>
r+<00>sortr,<00>maxrrIr7<00>zero_rH<00>range)r<00>batch<63>_<>ids_sorted_decreasing<6E> max_spec_len<65> max_wave_len<65> spec_lengths<68> wave_lengths<68> spec_padded<65> wave_padded<65> max_phone_len<65> phone_lengths<68> phone_paddedZ pitch_paddedZ pitchf_paddedr/<00>i<>rowr<<00>waver;r&r'rrr<00>__call__<5F>s\<16>
   <02>z$TextAudioCollateMultiNSFsid.__call__N)F<>rdrerfrgrr<>rrrrrh<00>s
rhc@sPeZdZdZdd<03>Zdd<05>Zdd<07>Zdd <09>Zd
d <0B>Zd d <0A>Z dd<0F>Z
dd<11>Z dS)<13>TextAudioLoaderrcCsbt|<01>|_|j|_|j|_|j|_|j|_|j|_|j|_t|dd<02>|_t|dd<04>|_ |<00>
<EFBFBD>dSrr rrrrr<00>s
zTextAudioLoader.__init__cCsrg}g}|jD]R\}}}|jt|<04>krt|<04>|jkr|<01>|||g<03>|<02>tj<06>|<03>d|j<00>q||_||_ dSrr)rr#r"r$r%r(rrrr<00>szTextAudioLoader._filtercCst<00>t|<01>g<01>}|Sr)r*r.rrrr0<00>szTextAudioLoader.get_sidc Cs<>|d}|d}|d}|<00>|<03>}|<00>|<02>\}}|<00>|<04>}|<03><03>d}|<05><03>d}||kr<>t||<08>} | |j}
|dd<00>d| <09>f}|dd<00>d|
<EFBFBD>f}|d| <09>dd<00>f}||||fS)Nrr rr3r4) rr9r:r;r(r<r=r>r?r@rArrrrB<00>s

  

z#TextAudioLoader.get_audio_text_paircCsLt<00>|<01>}tj|ddd<03>}t|jdd<04>}|d|<02>dd<00>f}t<05>|<01>}|SrC)rErFrGr8rHr+rI)rr;rJrrrr5s 

zTextAudioLoader.get_labelsc Cs<>t|<01>\}}||jkr(td<01>||j<01><02><01>||j}|<04>d<02>}|<01>dd<04>}tj<08> |<05>r<>zt
<EFBFBD> |<05>}Wq<>t |t <0A><0E><00>t||j|j|j|jdd<06>}t
<EFBFBD>|d<02>}t
j||dd<07>Yq<>0n8t||j|j|j|jdd<06>}t
<EFBFBD>|d<02>}t
j||dd<07>||fSrKrNrYrrrr6s@ 
<06><02>

   <02> <02> zTextAudioLoader.get_audiocCs|<00>|j|<00>Sr)r\r]rrrr_0szTextAudioLoader.__getitem__cCs
t|j<01>Sr)r`rarrrrb3szTextAudioLoader.__len__Nrcrrrrr<><00>s "r<>c@s"eZdZdZddd<04>Zdd<06>ZdS) <09>TextAudioCollateriFcCs
||_dSr)rjrlrrrr8szTextAudioCollate.__init__c Cs<>tjt<00>dd<02>|D<00><01>ddd<05>\}}tdd<02>|D<00><01>}tdd<02>|D<00><01>}t<00>t|<01><01>}t<00>t|<01><01>}t<00>t|<01>|dd<00>d<03>|<04>}t<00>t|<01>d|<05>} |<08><07>| <09><07>td d<02>|D<00><01>}
t<00>t|<01><01>} t<00>t|<01>|
|dd
jd<00>} | <0C><07>t<00>t|<01><01>} t t|<03><01>D]<5D>}|||}|d}|||d d <0B>d |<10>d<08><01>f<|<10>d<08>||<|d}|| |d d <0B>d |<11>d<08><01>f<|<11>d<08>||<|d
}|| |d |<12>d<03><01>d d <0B>f<|<12>d<03>| |<|d | |<<00>q| | ||| || fS) rmcSsg|]}|d<00>d<01><01>qSrnrorprrrrsCrtz-TextAudioCollate.__call__.<locals>.<listcomp>rTrucSsg|]}|d<00>d<01><01>qSrnrorprrrrsFrtcSsg|]}|d<00>d<00><01>qSrxrorprrrrsGrtr cSsg|]}|d<00>d<01><01>qSryrorprrrrsOrtrNr1r{)rr<>r<>r<>r<>r<>r<>r<>r<>r<>r<>r<>r<>r/r<>r<>r<r<>r;rrrr<>;sH<16>
   <02>zTextAudioCollate.__call__N)Fr<46>rrrrr<>5s
r<>csDeZdZdZd<0F>fdd<05> Zdd<07>Zdd <09>Zdd d <0C>Zd d<0E>Z<08>Z S)<11>DistributedBucketSamplera<72>
Maintain similar input lengths in a batch.
Length groups are specified by boundaries.
Ex) boundaries = [b1, b2, b3] -> every batch is drawn entirely from either {x | b1 < length(x) <= b2} or {x | b2 < length(x) <= b3}.
Samples whose lengths fall outside the boundaries are removed.
Ex) boundaries = [b1, b2, b3] -> any x s.t. length(x) <= b1 or length(x) > b3 is discarded.
NTcsVt<00>j||||d<01>|j|_||_||_|<00><05>\|_|_t|j<07>|_ |j |j
|_ dS)N)<03> num_replicas<61>rank<6E>shuffle) <0C>superrr"<00>
batch_size<EFBFBD>
boundaries<EFBFBD>_create_buckets<74>buckets<74>num_samples_per_bucket<65>sum<75>
total_sizer<EFBFBD><00> num_samples)r<00>datasetr<74>r<>r<>r<>r<><00><01> __class__rrr{s  z!DistributedBucketSampler.__init__c Cs<>dd<02>tt|j<02>d<00>D<00>}tt|j<03><01>D].}|j|}|<00>|<03>}|dkr*||<00>|<02>q*tt|<01>ddd<04>D].}t||<00>dkrn|<01>|<02>|j<02>|d<00>qng}tt|<01><01>D]:}t||<00>}|j|j}||||}|<05>||<00>q<>||fS)NcSsg|]}g<00>qSrr)rqr<>rrrrs<00>rtz<DistributedBucketSampler._create_buckets.<locals>.<listcomp>r r3r) rrr<>r"<00>_bisectr<00>popr<70>r<>) rr<>r<><00>lengthZ
idx_bucketr<EFBFBD><00>
len_bucketZtotal_batch_size<7A>remrrrr<><00>s&


  
<02>z(DistributedBucketSampler._create_bucketsc s<>t<00><01>}|<01>|j<03>g}|jrH|jD] <20>|<02>tjt<08><01>|d<01><02> <09><00>q$n"|jD]<1A>|<02>t
t t<08><01><01><01><01>qNg<00>t t|j<05><01>D]<5D>}|j|<00>t<08><01>}||}|j |}||}|||||d||<00>}||j d|j<0E>}t t|<05>|j<00>D]8}<08>fdd<03>|||j|d|j<00>D<00>} <09><00>| <09>q<>q||j<04>r\tjt<08><00>|d<01><02> <09>}
<EFBFBD>fdd<03>|
D<00><01><00>|_t|j<10>|j|jk<02>s~J<00>t|j<10>S)N)<01> generatorcsg|] }<01>|<00>qSrr)rq<00>idx)<01>bucketrrrs<00>s<02>z5DistributedBucketSampler.__iter__.<locals>.<listcomp>r csg|] }<01>|<00>qSrr)rqr<>)<01>batchesrrrs<00>rt)r+<00> Generator<6F> manual_seed<65>epochr<68>r<>r<00>randpermr<00>tolist<73>listrr<>r<>r<>r<>r<>r<><00>iter) r<00>g<>indicesr<73>r<>Z
ids_bucketZnum_samples_bucketr<74><00>jr<6A>Z batch_idsr)r<>r<>r<00>__iter__<5F>sF 
 



<EFBFBD><0E><02>
<16><02>z!DistributedBucketSampler.__iter__rcCs<>|durt|j<01>d}||kr~||d}|j||krN||j|dkrN|S||j|krj|<00>|||<04>S|<00>||d|<03>SndSdS)Nr rr3)rr<>r<>)rrr<00>lo<6C>hi<68>midrrrr<><00>s  z DistributedBucketSampler._bisectcCs |j|jSr))r<>r<>rarrrrb<00>sz DistributedBucketSampler.__len__)NNT)rN)
rdrerfrgrr<>r<>r<>rb<00> __classcell__rrr<>rr<>qs<00>1
r<>)rrU<00>numpyrEr+<00>torch.utils.data<74>mel_processingr<00>utilsrr<00>data<74>Datasetrrhr<>r<><00> distributed<65>DistributedSamplerr<72>rrrr<00><module>s yJi<
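
The DistributedBucketSampler docstring in the dump above describes a bucketing rule: a sample of a given length belongs to bucket i when boundaries[i] < length <= boundaries[i + 1], and is discarded otherwise. The following is a minimal sketch of that rule only, not the file's own code. The helper names assign_bucket and group_by_bucket are hypothetical, and the standard-library bisect module is swapped in for the sampler's own _bisect method.

```python
from bisect import bisect_left


def assign_bucket(length, boundaries):
    """Return i such that boundaries[i] < length <= boundaries[i + 1], or -1.

    Hypothetical helper; assumes `boundaries` is sorted in ascending order.
    """
    # bisect_left finds the first position whose boundary is >= length,
    # so the bucket index is one position to the left of it.
    idx = bisect_left(boundaries, length) - 1
    if 0 <= idx < len(boundaries) - 1:
        return idx
    return -1  # length <= first boundary or length > last boundary


def group_by_bucket(lengths, boundaries):
    """Group sample indices by bucket, dropping out-of-range samples."""
    buckets = [[] for _ in range(len(boundaries) - 1)]
    for sample_idx, length in enumerate(lengths):
        bucket_idx = assign_bucket(length, boundaries)
        if bucket_idx != -1:
            buckets[bucket_idx].append(sample_idx)
    return buckets


if __name__ == "__main__":
    # Example boundaries chosen for illustration only.
    print(group_by_bucket([10, 33, 300, 301, 400, 401], [32, 300, 400]))
```

With boundaries [32, 300, 400], a length of exactly 32 is discarded while a length of exactly 400 lands in the last bucket, which matches the half-open intervals stated in the docstring.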