The key focus of WGAN is to keep a smooth, usable gradient everywhere. The sigmoid function seems to go in the opposite direction: it saturates and flattens the gradient away from zero. The current focus is on a simple implementation that satisfies the Lipschitz constraint. Whether WGAN is the right direction for GANs at all is another question that still needs an answer.
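For reference, the "simple implementation" of the Lipschitz constraint in the original WGAN paper is weight clipping: after each critic update, every weight is clamped to a small interval. A minimal sketch (the function name and the clip value `c=0.01` are illustrative, though 0.01 is the paper's default):

```python
import numpy as np

def clip_weights(weights, c=0.01):
    # WGAN's simple trick: clamp every critic weight to [-c, c]
    # after each update, which bounds the weights and (roughly)
    # enforces a K-Lipschitz critic.
    return [np.clip(w, -c, c) for w in weights]

# Example: weights outside [-0.01, 0.01] get clamped, others pass through.
layer = np.array([0.5, -0.2, 0.005])
print(clip_weights([layer]))  # [array([ 0.01 , -0.01 ,  0.005])]
```

Note that clipping is a crude surrogate: it bounds the weights, not the Lipschitz constant directly, which is part of why later work (e.g. gradient penalty) replaced it.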
Can I add a sigmoid? If the model performs the same with or without it, then probably not. If it improves results, why not keep it? But in the latter case, if the improvement holds across many other setups, that would be something interesting to look into, since it is rather unexpected.
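The saturation worry above can be made concrete: the sigmoid's derivative peaks at 0.25 and decays toward zero for large inputs, so a critic output pushed far from zero passes almost no gradient back, whereas a linear (WGAN-style) output has constant slope everywhere. A small sketch of that comparison:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_grad(x):
    # d/dx sigmoid(x) = sigmoid(x) * (1 - sigmoid(x))
    s = sigmoid(x)
    return s * (1.0 - s)

# Near zero the sigmoid still has a healthy gradient...
print(sigmoid_grad(0.0))          # 0.25, the maximum
# ...but once the critic is confident, the gradient nearly vanishes.
print(sigmoid_grad(10.0) < 1e-4)  # True
# A linear output has derivative 1.0 at every input, by contrast.
```

This is the informal argument for dropping the sigmoid: keeping it reintroduces exactly the vanishing-gradient regime WGAN was designed to avoid.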