Also, because of the comparatively small plane wave cutoffs, the density will show small but significant modulations, especially in regions with little electron density. These lead to ``strange'' effects with gradient corrected functionals, which can cause the optimization to fail. To avoid this, you can skip the calculation of the gradient correction in regions of low electron density by setting GC-CUTOFF to a value between 1.D-6 and 1.D-5 in the &DFT section.
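As a minimal sketch of the GC-CUTOFF setting described above, a &DFT section might look like the following. The functional (PBE) and the value 1.0D-6 are illustrative assumptions, not recommendations from this text. The layout assumes the usual input convention that a keyword's numerical value is read from the following line.

```
&DFT
  FUNCTIONAL PBE
  GC-CUTOFF
    1.0D-6
&END
```

If the optimization still fails, a value closer to 1.D-5 from the range given above would be the next thing to try.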
For geometry optimizations, the accurate calculation of the forces due to the augmentation charges may need a higher density cutoff and thus a finer real space grid. This can be achieved by using a higher plane wave cutoff, by increasing DUAL from the default value of 4.0 to 5.0-6.0 (or up to 10.0), and/or by setting the real space density cutoff explicitly via the DENSITY CUTOFF keyword in the &SYSTEM section. For the same reason, these options may be needed to improve energy conservation during molecular dynamics runs. Use these options with care, as they will increase the CPU time and memory requirements significantly and thus can easily take away one of the major advantages of ultra-soft pseudopotentials.
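The DUAL route can be sketched as the &SYSTEM fragment below. The wavefunction cutoff of 25.0 Ry and the DUAL value of 6.0 are illustrative assumptions, and the other entries a real &SYSTEM section needs (cell, symmetry, and so on) are omitted. As before, each value is assumed to be read from the line following its keyword.

```
&SYSTEM
  CUTOFF
    25.0
  DUAL
    6.0
&END
```

Because the density cutoff is derived from the wavefunction cutoff via the DUAL factor, these settings would correspond to a density cutoff of 6.0 x 25.0 = 150 Ry, compared with 100 Ry at the default DUAL of 4.0. Setting DENSITY CUTOFF explicitly is the alternative way to reach the same cutoff.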