adversarial-samples-attacking-ASV-systems

Adversarial Samples from "Adversarial Attack on GMM i-vector based Speaker Verification Systems"

Authors: Xu Li, Jinghua Zhong, Xixin Wu, Jianwei Yu, Xunying Liu and Helen Meng

Abstract: This work investigates the vulnerability of Gaussian Mixture Model (GMM) i-vector based speaker verification (SV) systems to adversarial attacks, and the transferability of adversarial samples crafted from GMM i-vector based systems to x-vector based systems. In detail, we formulate the GMM i-vector based system as a scoring function, and leverage the fast gradient sign method (FGSM) to generate adversarial samples through this function. These adversarial samples are used to attack both GMM i-vector and x-vector based systems. We measure the vulnerability of the systems by the degradation of equal error rate and false acceptance rate. Experimental results show that GMM i-vector based systems are seriously vulnerable to adversarial attacks, and the generated adversarial samples are proved to be transferable and pose threats to neural network speaker embedding based systems (e.g. x-vector systems).
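The FGSM step mentioned in the abstract can be illustrated with a minimal sketch. It assumes a toy linear scoring function S(x) = w · x in place of the real GMM i-vector scorer, so `score`, `score_gradient`, the weights `w`, and the feature vector `x` are all hypothetical names and values chosen for illustration. The sign convention (raise the score for non-target trials, lower it for target trials) is likewise an assumption about how such an attack is usually set up, not a statement of the authors' exact implementation.

```python
# Toy FGSM sketch. The linear scorer below is a hypothetical stand-in,
# NOT the GMM i-vector scoring function used in the paper.

def sign(v):
    """Return -1, 0 or +1 depending on the sign of v."""
    return (v > 0) - (v < 0)

def score(x, w):
    """Hypothetical score S(x) = w . x (higher means 'more like the target')."""
    return sum(wi * xi for wi, xi in zip(w, x))

def score_gradient(x, w):
    """Gradient of S with respect to x; for the linear toy scorer it is w."""
    return list(w)

def fgsm(x, w, epsilon, is_target_trial):
    """One FGSM step: shift each feature by epsilon along the gradient sign.

    Assumed convention: non-target (impostor) trials push the score up,
    target trials push it down.
    """
    direction = -1 if is_target_trial else 1
    grad = score_gradient(x, w)
    return [xi + direction * epsilon * sign(g) for xi, g in zip(x, grad)]

# Illustrative values only.
w = [0.5, -1.0, 0.25]
x = [1.0, 2.0, -1.0]
x_adv = fgsm(x, w, epsilon=0.3, is_target_trial=False)

print("original score:   ", score(x, w))
print("adversarial score:", score(x_adv, w))
```

For a linear scorer the score rises by exactly ε times the sum of |w_i|, which makes the effect of the perturbation degree ε easy to see. A real system would need the gradient of its own scoring function with respect to the acoustic features.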

System Description

LPMS-ivec: Log power magnitude spectrum based i-vector SV system

MFCC-ivec: Mel-frequency cepstral coefficients based i-vector SV system

MFCC-xvec: Mel-frequency cepstral coefficients based x-vector SV system

Spoofing audio samples (generated from LPMS-ivec) at different perturbation degrees ε, and the corresponding system responses

TR: true rejection; FA: false acceptance
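As a rough illustration of how these responses and the metrics named in the abstract relate, the sketch below labels a single trial and computes the false acceptance rate and an approximate equal error rate from lists of scores. The threshold, the score values, and the function names are hypothetical. The TA (true acceptance) and FR (false rejection) labels are added here only for completeness and do not appear in the tables on this page.

```python
# Sketch of trial outcomes and error rates under an assumed score threshold.
# All scores and the threshold are made-up illustration values.

def outcome(score, threshold, is_target):
    """Label one trial: TA/FR for target trials, FA/TR for non-target trials."""
    accepted = score >= threshold
    if is_target:
        return "TA" if accepted else "FR"
    return "FA" if accepted else "TR"

def far_frr(target_scores, nontarget_scores, threshold):
    """False acceptance rate and false rejection rate at a given threshold."""
    far = sum(s >= threshold for s in nontarget_scores) / len(nontarget_scores)
    frr = sum(s < threshold for s in target_scores) / len(target_scores)
    return far, frr

def equal_error_rate(target_scores, nontarget_scores):
    """Approximate EER: scan thresholds at the observed scores and return
    the mean of FAR and FRR at the point where they are closest."""
    best = None
    for t in sorted(set(target_scores) | set(nontarget_scores)):
        far, frr = far_frr(target_scores, nontarget_scores, t)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2)
    return best[1]

target_scores = [2.0, 1.5, 0.2, 1.8]
nontarget_scores = [-1.0, 0.5, -0.3, 0.1]

print(outcome(0.1, 0.4, is_target=False))   # impostor below threshold
print(outcome(0.5, 0.4, is_target=False))   # impostor above threshold
print(far_frr(target_scores, nontarget_scores, 0.4))
print(equal_error_rate(target_scores, nontarget_scores))
```

An adversarial sample that succeeds on a non-target trial moves the outcome from TR to FA, which raises the false acceptance rate and, in turn, the equal error rate.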

ε = 0.3, 1.0, 5.0, 10.0

target (enroll)

System	original test (ε = 0)	ε = 0.3	ε = 1.0	ε = 5.0	ε = 10.0
LPMS-ivec (white box attack)	TR	FA	FA	FA	TR
MFCC-ivec (cross feature)	TR	TR	TR	FA	FA
MFCC-xvec (cross both)	TR	TR	TR	TR	TR

ε = 20.0, 30.0, 50.0

target (enroll)

System	original test (ε = 0)	ε = 20.0	ε = 30.0	ε = 50.0
LPMS-ivec (white box attack)	TR	--	--	--
MFCC-ivec (cross feature)	TR	FA	FA	TR
MFCC-xvec (cross both)	TR	FA	TR	TR