Adversarial Samples from "Adversarial Attack on GMM i-vector based Speaker Verification Systems"

  • Authors: Xu Li, Jinghua Zhong, Xixin Wu, Jianwei Yu, Xunying Liu and Helen Meng
  • Abstract: This work investigates the vulnerability of Gaussian Mixture Model (GMM) i-vector based speaker verification (SV) systems to adversarial attacks, and the transferability of adversarial samples crafted from GMM i-vector based systems to x-vector based systems. In detail, we formulate the GMM i-vector based system as a scoring function, and leverage the fast gradient sign method (FGSM) to generate adversarial samples through this function. These adversarial samples are used to attack both GMM i-vector and x-vector based systems. We measure the vulnerability of the systems by the degradation of equal error rate and false acceptance rate. Experimental results show that GMM i-vector based systems are seriously vulnerable to adversarial attacks, and that the generated adversarial samples are transferable and pose threats to neural network speaker embedding based systems (e.g. x-vector systems).
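
    The FGSM step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the linear `toy_score` function, its weight vector `w`, and the feature dimension are hypothetical stand-ins for the GMM i-vector scoring function, chosen only because the gradient is trivial to write down. The sketch also assumes a non-target (impostor) trial, so the perturbation is taken in the direction that raises the score.

```python
import numpy as np

def toy_score(x, w):
    """Hypothetical stand-in for the SV scoring function: a linear score w . x."""
    return float(np.dot(w, x))

def toy_score_grad(x, w):
    """Gradient of the toy score with respect to the input features x."""
    return w.copy()

def fgsm_perturb(x, grad, epsilon):
    """One FGSM step: shift each feature by epsilon along the sign of the gradient."""
    return x + epsilon * np.sign(grad)

rng = np.random.default_rng(0)
w = rng.normal(size=8)   # made-up scoring weights
x = rng.normal(size=8)   # made-up input features of a non-target test utterance
epsilon = 0.3            # perturbation degree

x_adv = fgsm_perturb(x, toy_score_grad(x, w), epsilon)
score_gain = toy_score(x_adv, w) - toy_score(x, w)
```

    In this toy setting every feature moves by exactly ε, and the score rises by ε times the sum of |w|; a real system would obtain the gradient by differentiating through feature extraction and scoring.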

  • System Description
    • LPMS-ivec: Log power magnitude spectrum (LPMS) based i-vector SV system
    • MFCC-ivec: Mel-frequency cepstral coefficients (MFCC) based i-vector SV system
    • MFCC-xvec: Mel-frequency cepstral coefficients (MFCC) based x-vector SV system

  • Spoofing audio (generated from LPMS-ivec) at different perturbation degrees ε, and the corresponding system responses

    TR: true rejection; FA: false acceptance
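
    As a rough illustration of how these labels arise, the sketch below thresholds the scores of non-target trials and reports the false acceptance rate as the fraction of trials labeled FA. The scores, the threshold of 0.0, and the function names are made up for the example and are not taken from the systems above.

```python
import numpy as np

def label_nontarget_trials(scores, threshold):
    """Label each non-target trial: 'FA' if it is accepted, 'TR' if it is rejected."""
    return ["FA" if s >= threshold else "TR" for s in scores]

def false_acceptance_rate(scores, threshold):
    """Fraction of non-target trials whose score reaches the acceptance threshold."""
    scores = np.asarray(scores, dtype=float)
    return float(np.mean(scores >= threshold))

scores = [-2.1, 0.4, 1.3, -0.7]   # made-up verification scores for impostor trials
threshold = 0.0                   # hypothetical decision threshold

labels = label_nontarget_trials(scores, threshold)
far = false_acceptance_rate(scores, threshold)
```

    A successful attack shows up as a trial that flips from TR to FA once the perturbation is added.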

    ε = 0.3, 1.0, 5.0, 10.0

    Audio clips: target (enroll), original test (ε = 0), ε = 0.3, ε = 1.0, ε = 5.0, ε = 10.0

    System (attack setting)        original test (ε = 0)   ε = 0.3   ε = 1.0   ε = 5.0   ε = 10.0
    LPMS-ivec (white box attack)   TR                      FA        FA        FA        TR
    MFCC-ivec (cross feature)      TR                      TR        TR        FA        FA
    MFCC-xvec (cross both)         TR                      TR        TR        TR        TR

    ε = 20.0, 30.0, 50.0

    Audio clips: target (enroll), original test (ε = 0), ε = 20.0, ε = 30.0, ε = 50.0

    System (attack setting)        original test (ε = 0)   ε = 20.0   ε = 30.0   ε = 50.0
    LPMS-ivec (white box attack)   TR                      --         --         --
    MFCC-ivec (cross feature)      TR                      FA         FA         TR
    MFCC-xvec (cross both)         TR                      FA         TR         TR