The performance of recent speaker diarization systems degrades when speaker durations are imbalanced. To address this issue, a speaker diarization system is designed using adversarial learning and a short-phrase prior. On the data side, under the short-phrase prior, the proposed method applies imbalanced data sampling to speakers with different durations, minimizing the gap in speech duration among speakers. For speaker representation extraction and clustering, a training scheme is designed to enhance the separability of clusters after imbalanced data sampling while keeping the cluster distribution similar to that obtained with balanced data. To avoid the data sparsity problem, adversarial learning is used to transfer the optimization process to a lower-dimensional embedding space. During inference, the proposed method enforces consistency among the clustering results of replicas augmented with different acoustic environments. Compared to existing methods, the proposed approach achieves absolute diarization error rate (DER) reductions of 6.15% and 4.27% on the imbalanced-duration subsets of the VoxConverse and AISHELL-4 datasets, respectively (relative reductions of 22.2% and 21.7%). The results indicate that the proposed method is a practical approach for mitigating the performance degradation of speaker diarization systems in scenarios with imbalanced speaker durations.
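As an illustration of the data-side idea, the following is a minimal sketch of duration balancing under a short-phrase assumption, not the sampling procedure of the proposed system. The names `split_into_phrases` and `balance_by_sampling`, the 1.5 s phrase length, and the choice to down-sample every speaker to the size of the smallest speaker are all hypothetical; the abstract does not specify these details.

```python
import random
from collections import defaultdict


def split_into_phrases(segments, phrase_len=1.5):
    """Cut (speaker, start, end) segments into fixed-length short phrases.

    phrase_len (seconds) is an assumed value, not taken from the paper.
    Any remainder shorter than phrase_len is discarded.
    """
    phrases = defaultdict(list)
    for speaker, start, end in segments:
        # Number of whole phrases that fit in this segment.
        n = int((end - start + 1e-9) // phrase_len)
        for i in range(n):
            phrases[speaker].append(
                (start + i * phrase_len, start + (i + 1) * phrase_len)
            )
    return phrases


def balance_by_sampling(phrases, seed=0):
    """Sample the same number of phrases per speaker (assumed strategy).

    Every speaker is down-sampled to the phrase count of the speaker
    with the least speech, so total sampled duration is equal across
    speakers. Raises ValueError if `phrases` is empty.
    """
    rng = random.Random(seed)
    target = min(len(p) for p in phrases.values())
    return {spk: sorted(rng.sample(p, target)) for spk, p in phrases.items()}
```

With this sketch, a recording in which one speaker talks for 33 s and another for 3 s yields the same number of 1.5 s phrases for both speakers after sampling. Down-sampling to the minimum is only one option; over-sampling short-duration speakers would be another way to narrow the duration gap.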