The VoxCeleb2 Dataset


VoxCeleb2 contains over 1 million utterances for 6,112 celebrities, extracted from videos uploaded to YouTube. The development set of VoxCeleb2 has no overlap with the identities in the VoxCeleb1 or SITW datasets.

                    dev          test
# of speakers       5,994        118
# of videos         145,569      4,911
# of utterances     1,092,009    36,237
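
As a quick consistency check, the per-split counts above can be summed and compared with the headline figures (6,112 celebrities, over 1 million utterances). The snippet below is only a sketch: the numbers are copied from the table, while the names `SPLITS`, `totals`, and `TOTALS` are illustrative and not part of any official tooling.

```python
# Split statistics copied from the table above (dev / test).
SPLITS = {
    "dev": {"speakers": 5994, "videos": 145569, "utterances": 1092009},
    "test": {"speakers": 118, "videos": 4911, "utterances": 36237},
}


def totals(splits):
    """Sum each statistic across all splits."""
    keys = next(iter(splits.values())).keys()
    return {k: sum(s[k] for s in splits.values()) for k in keys}


TOTALS = totals(SPLITS)
print(TOTALS)
# {'speakers': 6112, 'videos': 150480, 'utterances': 1128246}
```

The speaker total matches the 6,112 identities quoted in the description, and the utterance total is consistent with "over 1 million utterances".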

The VoxCeleb1 dataset has its own page. If you require text annotations (e.g. for audio-visual speech recognition), consider the LRS dataset as well.


Downloads


Terms and Conditions

The VoxCeleb2 dataset consists of YouTube URLs with timestamps for utterances. For privacy concerns regarding the dataset, please refer to our Dataset Privacy Notice.

The provided VoxCeleb2 metadata is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

URLs and timestamps

The URLs and timestamps for the VoxCeleb2 dataset are no longer available from this website.

Audio files

The audio files for the VoxCeleb2 dataset are no longer available from this website.

Video files

The video files for the VoxCeleb2 dataset are no longer available from this website.

Metadata

The identifying metadata files for the VoxCeleb2 dataset are no longer available from this website.

Verification Set
List of trial pairs - VoxCeleb1
List of trial pairs - VoxCeleb1 (cleaned)
List of trial pairs - VoxCeleb1-H
List of trial pairs - VoxCeleb1-H (cleaned)
List of trial pairs - VoxCeleb1-E
List of trial pairs - VoxCeleb1-E (cleaned)
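
This page does not describe the layout of the trial-pair lists. The sketch below assumes, without confirmation from this page, that each line holds a 0/1 label (1 for a same-speaker pair) followed by two utterance paths separated by whitespace. The `Trial` type, the `parse_trials` helper, and the sample paths are all hypothetical; adjust them if the files you obtain are laid out differently.

```python
from typing import List, NamedTuple


class Trial(NamedTuple):
    is_target: bool  # True when both utterances are assumed to share a speaker
    first: str       # path of the first utterance
    second: str      # path of the second utterance


def parse_trials(text: str) -> List[Trial]:
    """Parse lines of the assumed form '<0|1> <path> <path>'."""
    trials = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue  # skip blank lines
        parts = line.split()
        if len(parts) != 3 or parts[0] not in ("0", "1"):
            raise ValueError(
                f"line {line_no}: expected '<0|1> <path> <path>', got {line!r}"
            )
        trials.append(Trial(parts[0] == "1", parts[1], parts[2]))
    return trials


# Made-up sample lines for illustration only; not taken from the real lists.
SAMPLE = """\
1 spk_a/video_1/00001.wav spk_a/video_2/00003.wav
0 spk_a/video_1/00001.wav spk_b/video_7/00002.wav
"""

TRIALS = parse_trials(SAMPLE)
NUM_TARGET = sum(t.is_target for t in TRIALS)
```

Raising on malformed lines, rather than skipping them silently, makes a mismatch between the assumed and the actual file layout visible immediately.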

Related Links
A download script and unofficial baseline code can be found here.

Models
Models trained for speaker verification can be found here.

Please cite the following if you make use of the dataset.

J. S. Chung*, A. Nagrani*, A. Zisserman  
VoxCeleb2: Deep Speaker Recognition  
INTERSPEECH, 2018.