Automated testing of broadcast systems generates large volumes of test logs containing failure reports that require manual analysis. When multiple tests fail due to a common root cause, redundant investigation by different engineers wastes time and resources. This paper evaluates two unsupervised machine learning techniques, the KMeans and HDBSCAN clustering algorithms, for automatically grouping failed test runs in the MediorNet product line at RIEDEL Communications Austria. The system processes semi-structured HTML test log files to extract composite feature vectors combining log text, test descriptions, headings, and metadata. We systematically evaluate how text extraction strategies (extracting only the first error versus all errors), feature vectorization approaches (BERT embeddings, TF-IDF, and hybrid representations), and component weighting schemes affect the performance of both clustering algorithms. Evaluation on approximately 10,000 failed test runs across 113 labeled test campaigns demonstrates that the feature vectorization strategy is the primary determinant of vector quality, with TF-IDF achieving higher group separation and BERT providing better cohesion. KMeans and HDBSCAN achieve comparable external validation scores (Adjusted Rand Index, ARI: 0.60±0.22 vs. 0.57±0.21). However, HDBSCAN exhibits better alignment between internal and external metrics and is better suited to the single-member clusters expected in practice, making it preferable for production deployment. Golden sample evaluation confirms the practical feasibility of the presented approach, demonstrating its maturity for industrial deployment in broadcast equipment testing workflows.
Publication Date: 2026-06-17