AudioGPT/NeuralSeq/tasks/tts/__pycache__/fs2_utils.cpython-38.pyc

49 lines
6.9 KiB
Plaintext

2023-03-20 15:43:44 +08:00
2023-03-24 17:19:37 +08:00
[Binary content: compiled CPython 3.8 bytecode, not displayable as text. Only the strings below are recoverable from the dump.]
Source file: /mnt/sdc/hongzhiqing/github/AudioGPT/text_to_sing/DiffSinger/tasks/tts/fs2_utils.py
Imports: matplotlib (matplotlib.use with "Agg"), glob, importlib, os, numpy (as np), torch, torch.optim, torch.utils.data, torch.distributions, utils, utils.cwt (get_lf0_cwt), utils.indexed_datasets (IndexedDataset), utils.pitch_utils (norm_interp_f0), tasks.base_task (BaseDataset), utils.hparams (hparams)
Class: FastSpeechDataset
Methods: __init__, _get_item, __getitem__, collater, load_test_inputs
hparams keys referenced: binary_data_dir, f0_mean, f0_std, test_input_dir, num_test_samples, test_ids, pitch_type, cwt_scales, max_frames, max_input_tokens, use_spk_embed, use_spk_id, binarizer_cls, binarization_args
Sample keys in __getitem__: id, item_name, text, txt_token, mel, pitch, energy, f0, uv, mel2ph, mel_nonpadding, spk_embed, spk_id, cwt_spec, f0_mean, f0_std, f0_ph
Batch keys in collater: id, item_name, nsamples, text, txt_tokens, txt_lengths, mels, mel_lengths, mel2ph, energy, pitch, f0, uv, spk_embed, spk_ids, cwt_spec, f0_mean, f0_std