4  Results

We summarize the cross-validated per-class performance metrics in the training set. Accuracy, F1-score, and kappa are the metrics of interest. For each metric, workflows are ordered by their mean estimates across the outer folds of the nested CV.

4.1 Training Set

4.2 Rank Aggregation

Multi-step methods:

  • sequential: the sequential algorithm uses the following sequence of subsampling methods and classifiers:
    • HGSOC vs. non-HGSOC using upsampling and XGBoost
    • CCOC vs. non-CCOC using SMOTE subsampling and XGBoost
    • ENOC vs. non-ENOC using no subsampling and support vector machine
    • MUOC vs. LGSOC using SMOTE subsampling and random forest
  • two_step: the two-step algorithm uses the following sequence of subsampling methods and classifiers:
    • HGSOC vs. non-HGSOC using upsampling and XGBoost
    • CCOC vs. ENOC vs. MUOC vs. LGSOC using SMOTE subsampling and support vector machine
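
The control flow of the two multi-step methods listed above can be sketched as follows. This is a minimal illustration only: the threshold-based stub classifiers, the toy five-element score vector, and the function names (`sequential_predict`, `two_step_predict`, `stub`, `stub_multiclass`) are hypothetical stand-ins for the fitted XGBoost, support vector machine, and random forest models, which the text does not specify in detail.

```python
from typing import Callable, List, Sequence, Tuple

Sample = Sequence[float]
# One cascade step: the label it targets and a binary classifier for that label.
BinaryStep = Tuple[str, Callable[[Sample], bool]]


def sequential_predict(sample: Sample, steps: List[BinaryStep], fallback: str) -> str:
    """Walk the cascade; the first step that fires names the histotype."""
    for label, is_target in steps:
        if is_target(sample):
            return label
    # The last step is MUOC vs. LGSOC, so a sample rejected by every step is LGSOC.
    return fallback


def two_step_predict(sample: Sample,
                     is_hgsoc: Callable[[Sample], bool],
                     multiclass: Callable[[Sample], str]) -> str:
    """Step 1: HGSOC vs. non-HGSOC. Step 2: one multiclass model for the rest."""
    return "HGSOC" if is_hgsoc(sample) else multiclass(sample)


def stub(index: int, threshold: float = 0.5) -> Callable[[Sample], bool]:
    """Hypothetical binary classifier: thresholds one element of the sample."""
    return lambda sample: sample[index] > threshold


def stub_multiclass(sample: Sample) -> str:
    """Hypothetical multiclass classifier: picks the largest non-HGSOC score."""
    labels = ["CCOC", "ENOC", "MUOC", "LGSOC"]
    scores = sample[1:5]
    return labels[max(range(len(labels)), key=lambda i: scores[i])]


STEPS = [("HGSOC", stub(0)), ("CCOC", stub(1)), ("ENOC", stub(2)), ("MUOC", stub(3))]

print(sequential_predict([0.1, 0.9, 0.2, 0.3, 0.0], STEPS, fallback="LGSOC"))
print(two_step_predict([0.1, 0.2, 0.3, 0.4, 0.8], stub(0), stub_multiclass))
```

In the sequential design an error at an early step cannot be corrected later, which is why the order of the binary problems matters.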

We conduct rank aggregation using a two-stage nested approach:

  1. First we rank aggregate the per-class metrics for F1-score, balanced accuracy and kappa.
  2. Then we take the aggregated lists from the three metrics and perform a final rank aggregation.
  3. The top workflows from the final rank aggregation are carried forward for gene optimization in the confirmation set.

4.2.1 Across Classes

4.2.2 Across Metrics

Table 4.10: Rank Aggregation Comparison of Metrics Used in Training Set
Rank F1 Balanced Accuracy Kappa
1 sequential sequential sequential
2 two_step hybrid_xgb two_step
3 up_rf up_mr smote_svm
4 smote_svm smote_mr up_rf
5 hybrid_xgb smote_xgb up_xgb
6 up_xgb two_step hybrid_xgb
7 hybrid_svm hybrid_mr hybrid_svm
8 hybrid_rf up_xgb hybrid_rf
9 smote_xgb hybrid_svm smote_xgb
10 smote_mr smote_svm up_svm
11 up_svm down_mr smote_mr
12 smote_rf hybrid_rf smote_rf
13 none_rf up_svm none_svm
14 hybrid_mr up_rf none_rf
15 up_mr smote_rf hybrid_mr
16 down_mr down_rf up_mr
17 down_svm down_xgb down_mr
18 down_rf none_svm down_rf
19 down_xgb none_rf down_xgb
20 NA down_svm down_svm
21 NA none_mr none_mr
22 NA none_xgb none_xgb
Table 4.11: Top 5 Workflows from Final Rank Aggregation
Rank Workflow
1 sequential
2 two_step
3 up_rf
4 smote_svm
5 hybrid_xgb

4.2.3 Top Workflows

We look at the per-class evaluation metrics of the top 5 workflows.

Table 4.12: Top Workflow Per-Class Evaluation Metrics
Histotypes
Metric Workflow HGSOC CCOC ENOC MUOC LGSOC
Accuracy Sequential 0.96 (0.95, 0.96) 0.92 (0.88, 0.96) 0.86 (0.77, 0.91) 0.95 (0.88, 1) 0.95 (0.88, 1)
2-STEP 0.96 (0.95, 0.96) 0.93 (0.9, 0.96) 0.84 (0.73, 0.9) 0.89 (0.81, 0.96) 0.97 (0.94, 1)
Up-RF 0.95 (0.92, 0.98) 0.98 (0.97, 0.99) 0.96 (0.95, 0.97) 0.98 (0.96, 0.99) 0.98 (0.97, 0.99)
SMOTE-SVM 0.95 (0.94, 0.97) 0.98 (0.97, 0.99) 0.95 (0.94, 0.96) 0.98 (0.96, 0.98) 0.98 (0.97, 0.98)
Hybrid-XGB 0.95 (0.94, 0.97) 0.98 (0.97, 0.99) 0.96 (0.94, 0.97) 0.97 (0.96, 0.98) 0.98 (0.97, 0.99)
Sensitivity Sequential 0.98 (0.97, 0.99) 0.88 (0.81, 1) 0.86 (0.81, 0.94) 0.97 (0.91, 1) 0.92 (0.6, 1)
2-STEP 0.98 (0.97, 0.99) 0.87 (0.82, 0.93) 0.73 (0.56, 0.82) 0.84 (0.75, 0.92) 0.77 (0, 1)
Up-RF 0.99 (0.98, 1) 0.79 (0.57, 0.88) 0.64 (0.44, 0.81) 0.77 (0.46, 0.91) 0.18 (0, 0.33)
SMOTE-SVM 0.97 (0.96, 0.98) 0.76 (0.57, 0.85) 0.68 (0.5, 0.86) 0.75 (0.54, 0.91) 0.64 (0.5, 0.75)
Hybrid-XGB 0.97 (0.96, 0.98) 0.85 (0.57, 0.95) 0.62 (0.44, 0.77) 0.82 (0.54, 0.91) 0.59 (0.38, 0.75)
Specificity Sequential 0.88 (0.85, 0.9) 0.94 (0.85, 1) 0.85 (0.73, 0.88) 0.92 (0.6, 1) 0.97 (0.91, 1)
2-STEP 0.88 (0.85, 0.9) 0.97 (0.94, 1) 0.89 (0.83, 0.94) 0.91 (0.83, 0.97) 0.98 (0.95, 1)
Up-RF 0.82 (0.74, 0.89) 1 (0.99, 1) 0.98 (0.97, 0.99) 0.99 (0.98, 1) 1 (0.99, 1)
SMOTE-SVM 0.88 (0.8, 0.92) 0.99 (0.99, 1) 0.97 (0.96, 0.98) 0.99 (0.98, 1) 0.99 (0.98, 1)
Hybrid-XGB 0.9 (0.85, 0.95) 0.99 (0.99, 1) 0.98 (0.97, 0.99) 0.98 (0.97, 0.99) 0.99 (0.98, 0.99)
F1-Score Sequential 0.97 (0.97, 0.98) 0.88 (0.81, 0.94) 0.86 (0.79, 0.91) 0.97 (0.92, 1) 0.91 (0.75, 1)
2-STEP 0.97 (0.97, 0.98) 0.9 (0.85, 0.93) 0.75 (0.61, 0.85) 0.79 (0.67, 0.92) 0.73 (0, 1)
Up-RF 0.97 (0.95, 0.99) 0.85 (0.7, 0.94) 0.65 (0.47, 0.79) 0.76 (0.57, 0.87) 0.36 (0.29, 0.4)
SMOTE-SVM 0.97 (0.96, 0.98) 0.81 (0.67, 0.92) 0.64 (0.5, 0.81) 0.75 (0.58, 0.83) 0.52 (0.36, 0.71)
Hybrid-XGB 0.97 (0.96, 0.98) 0.86 (0.67, 0.95) 0.64 (0.5, 0.81) 0.72 (0.61, 0.8) 0.51 (0.44, 0.67)
Balanced Accuracy Sequential 0.93 (0.91, 0.94) 0.91 (0.86, 0.95) 0.86 (0.77, 0.91) 0.94 (0.8, 1) 0.94 (0.8, 1)
2-STEP 0.93 (0.91, 0.94) 0.92 (0.88, 0.95) 0.81 (0.69, 0.88) 0.87 (0.79, 0.95) 0.87 (0.48, 1)
Up-RF 0.9 (0.86, 0.94) 0.89 (0.78, 0.94) 0.81 (0.71, 0.9) 0.88 (0.73, 0.95) 0.59 (0.5, 0.66)
SMOTE-SVM 0.92 (0.89, 0.95) 0.88 (0.78, 0.92) 0.83 (0.74, 0.92) 0.87 (0.76, 0.95) 0.81 (0.75, 0.87)
Hybrid-XGB 0.94 (0.91, 0.96) 0.92 (0.78, 0.97) 0.8 (0.72, 0.88) 0.9 (0.76, 0.95) 0.79 (0.68, 0.87)
Kappa Sequential 0.86 (0.83, 0.88) 0.82 (0.72, 0.91) 0.71 (0.55, 0.82) 0.88 (0.68, 1) 0.88 (0.68, 1)
2-STEP 0.86 (0.83, 0.88) 0.85 (0.77, 0.9) 0.63 (0.4, 0.77) 0.72 (0.54, 0.9) 0.72 (-0.03, 1)
Up-RF 0.85 (0.77, 0.9) 0.84 (0.68, 0.93) 0.63 (0.45, 0.77) 0.75 (0.55, 0.86) 0.21 (0, 0.4)
SMOTE-SVM 0.84 (0.78, 0.88) 0.8 (0.65, 0.91) 0.61 (0.48, 0.79) 0.74 (0.56, 0.83) 0.52 (0.35, 0.71)
Hybrid-XGB 0.85 (0.83, 0.89) 0.85 (0.65, 0.95) 0.62 (0.48, 0.79) 0.7 (0.59, 0.79) 0.51 (0.44, 0.66)
Figure 4.7: Top 5 Workflow Per-Class Evaluation Metrics by Metric
Table 4.13: Top Workflow Per-Class Evaluation Metrics and Ranks
Workflow Rank HGSOC CCOC ENOC MUOC LGSOC
F1-Score
Sequential 1 0.973 0.877 0.860 0.966 0.910
2-STEP 2 0.973 0.899 0.755 0.788 0.733
Up-RF 3 0.972 0.849 0.652 0.757 0.362
SMOTE-SVM 4 0.969 0.811 0.638 0.752 0.524
Hybrid-XGB 5 0.971 0.860 0.643 0.716 0.515
Balanced Accuracy
Sequential 1 0.926 0.907 0.856 0.943 0.943
Hybrid-XGB 2 0.935 0.922 0.802 0.899 0.791
2-STEP 6 0.926 0.917 0.814 0.875 0.872
SMOTE-SVM 10 0.923 0.875 0.826 0.869 0.814
Up-RF 14 0.905 0.894 0.808 0.878 0.590
Kappa
Sequential 1 0.858 0.815 0.712 0.877 0.877
2-STEP 2 0.858 0.850 0.635 0.716 0.718
SMOTE-SVM 3 0.839 0.800 0.614 0.741 0.516
Up-RF 4 0.850 0.840 0.630 0.746 0.213
Hybrid-XGB 6 0.855 0.852 0.620 0.702 0.506
Figure 4.8: Top 5 Workflow Per-Class Evaluation Metrics by Metric

Misclassified cases from a previous step of the sequence of classifiers are not included in subsequent steps of the training set CV folds. Thus, we cannot piece together the test set predictions from the sequential and two-step algorithms to obtain overall metrics.

4.3 Confirmation Set

We next evaluate the top five workflows in the confirmation set. Workflows are compared using class-specific F1-scores, and the top-performing method is selected for gene optimization.
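
For readers unfamiliar with class-specific metrics, the sketch below shows one common way to compute them, by treating each histotype as a one-vs-rest binary problem. The confusion matrix counts and the function name `one_vs_rest_metrics` are hypothetical, and the exact definitions used to produce the tables in this section are assumed rather than confirmed by the text.

```python
from typing import Dict, List


def one_vs_rest_metrics(confusion: List[List[int]],
                        labels: List[str],
                        target: str) -> Dict[str, float]:
    """confusion[i][j] counts samples of true class i predicted as class j.

    Assumes the target class occurs and is predicted at least once, so that
    no denominator is zero.
    """
    t = labels.index(target)
    total = sum(sum(row) for row in confusion)
    tp = confusion[t][t]
    fn = sum(confusion[t]) - tp                 # target cases predicted as another class
    fp = sum(row[t] for row in confusion) - tp  # other classes predicted as target
    tn = total - tp - fn - fp
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
        "balanced_accuracy": (sensitivity + specificity) / 2,
    }


# Hypothetical three-class confusion matrix (rows: truth, columns: prediction).
LABELS = ["HGSOC", "CCOC", "ENOC"]
CONFUSION = [
    [90, 5, 5],
    [2, 16, 2],
    [8, 2, 10],
]

enoc = one_vs_rest_metrics(CONFUSION, LABELS, "ENOC")
print({name: round(value, 3) for name, value in enoc.items()})
```

Under these definitions balanced accuracy is the mean of sensitivity and specificity. The Sequential HGSOC entries in Table 4.14 are consistent with this: (0.953 + 0.697) / 2 = 0.825.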

Table 4.14: Evaluation Metrics on Confirmation Set Models
Histotypes
Method Metric Overall HGSOC CCOC ENOC MUOC LGSOC
Sequential Accuracy 0.829 0.866 0.969 0.883 0.966 0.974
Sensitivity 0.584 0.953 0.861 0.467 0.556 0.083
Specificity 0.924 0.697 0.982 0.966 0.984 0.990
F1-Score 0.604 0.904 0.861 0.571 0.577 0.105
Balanced Accuracy 0.754 0.825 0.922 0.717 0.770 0.537
Kappa 0.646 0.685 0.844 0.508 0.559 0.093
2-STEP Accuracy 0.838 0.866 0.970 0.891 0.977 0.972
Sensitivity 0.613 0.953 0.875 0.486 0.667 0.083
Specificity 0.926 0.697 0.982 0.972 0.990 0.989
F1-Score 0.635 0.904 0.869 0.598 0.706 0.100
Balanced Accuracy 0.769 0.825 0.929 0.729 0.828 0.536
Kappa 0.666 0.685 0.852 0.538 0.694 0.086
Up-RF Accuracy 0.835 0.857 0.975 0.883 0.974 0.981
Sensitivity 0.613 0.972 0.875 0.383 0.667 0.167
Specificity 0.918 0.633 0.988 0.983 0.987 0.997
F1-Score 0.648 0.900 0.887 0.522 0.679 0.250
Balanced Accuracy 0.765 0.802 0.931 0.683 0.827 0.582
Kappa 0.646 0.654 0.873 0.466 0.665 0.243
SMOTE-SVM Accuracy 0.827 0.866 0.958 0.888 0.972 0.970
Sensitivity 0.650 0.939 0.861 0.477 0.556 0.417
Specificity 0.927 0.725 0.970 0.970 0.990 0.981
F1-Score 0.656 0.902 0.821 0.586 0.625 0.345
Balanced Accuracy 0.788 0.832 0.916 0.723 0.773 0.699
Kappa 0.651 0.690 0.797 0.525 0.611 0.330
Hybrid-XGB Accuracy 0.830 0.869 0.961 0.893 0.967 0.970
Sensitivity 0.662 0.943 0.861 0.458 0.630 0.417
Specificity 0.928 0.725 0.974 0.979 0.982 0.981
F1-Score 0.657 0.905 0.832 0.587 0.618 0.345
Balanced Accuracy 0.795 0.834 0.917 0.719 0.806 0.699
Kappa 0.657 0.696 0.810 0.531 0.601 0.330
Figure 4.9: Evaluation Metrics on Confirmation Set Models
Figure 4.10: Entropy vs. Predicted Probability in Confirmation Set
Figure 4.11: Gene Optimized Workflows Per-Class Metrics in Confirmation Set
Figure 4.12: Confusion Matrices for Confirmation Set Models

4.4 Gene Optimization

From Table 4.14 and Figure 4.9, we see that Hybrid-XGB and SMOTE-SVM have the two highest overall F1-scores (0.657 and 0.656) and are relatively better at predicting the rarest histotype, LGSOC (sensitivity = 0.42). Thus, we choose both workflows for gene optimization in the confirmation set. The optimal number of genes is the one with the highest average F1-score across classes, with the overall metric included in the average. Specifically, the F1-score is averaged across the 5 cross-validation folds and the 6 class groups (Overall, HGSOC, CCOC, ENOC, MUOC, LGSOC) to compare performance across different numbers of selected genes.
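
The selection rule described above can be sketched as follows. The nested-dictionary layout, the function name `optimal_gene_count`, and all F1 values are hypothetical. For brevity the toy data use 2 folds and 2 class groups rather than the 5 folds and 6 class groups of the actual analysis.

```python
from statistics import mean
from typing import Dict, List, Tuple

# f1_scores[n_added] is a list of folds; each fold maps class group -> F1-score.
F1Table = Dict[int, List[Dict[str, float]]]


def optimal_gene_count(f1_scores: F1Table) -> Tuple[int, Dict[int, float]]:
    """Average F1 over folds and class groups, then pick the best gene count."""
    averaged = {
        n_added: mean(f1 for fold in folds for f1 in fold.values())
        for n_added, folds in f1_scores.items()
    }
    # On ties, max() keeps the first gene count encountered.
    best = max(averaged, key=averaged.get)
    return best, averaged


# Hypothetical F1-scores for 0, 1, and 2 genes added to the base set.
toy_scores: F1Table = {
    0: [{"Overall": 0.60, "LGSOC": 0.20}, {"Overall": 0.62, "LGSOC": 0.18}],
    1: [{"Overall": 0.64, "LGSOC": 0.30}, {"Overall": 0.66, "LGSOC": 0.32}],
    2: [{"Overall": 0.63, "LGSOC": 0.28}, {"Overall": 0.65, "LGSOC": 0.30}],
}

best_n, averaged_f1 = optimal_gene_count(toy_scores)
print(best_n, {n: round(v, 3) for n, v in averaged_f1.items()})
```

Because every class group carries equal weight in the average, a gain on a rare class such as LGSOC can shift the optimum as much as a gain on the overall score.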

4.4.1 Hybrid-XGB

Figure 4.18: Gene Optimization for Hybrid-XGB Classifier using Averaged F1-Score

In the Hybrid-XGB classifier, the highest averaged F1-score is achieved with 19 genes added, highlighted in red in Figure 4.18. Hence, the optimal total number of genes is n = 28 + 19 = 47.

The gene profile of the optimal set of genes used is displayed in Table 4.15. Base genes in the PrOTYPE and SPOT sets are annotated with green circles, and the added genes are annotated with yellow circles. The added genes are: HNF1B, TFF1, TPX2, WT1, IGFBP1, LGALS4, TFF3, KLK7, CYP2C18, GPR64, CAPN2, MET, GCNT3, GAD1, SLC3A1, EGFL6, C1orf173, DKK4 and C10orf116. Unused genes are annotated with red crosses.

Table 4.15: Gene Profile of Optimal Set in Hybrid-XGB Workflow
Set Genes PrOTYPE SPOT Optimal Set Candidate Rank
Base COL11A1
CD74
CD2
TIMP3
LUM
CYTIP
COL3A1
THBS2
TCF7L1
HMGA2
FN1
POSTN
COL1A2
COL5A2
PDZK1IP1
FBN1
HIF1A
CXCL10
DUSP4
SOX17
MITF
CDKN3
BRCA2
CEACAM5
ANXA4
SERPINE1
CRABP2
DNAJC9
Candidates HNF1B 1
TFF1 2
TPX2 3
WT1 4
IGFBP1 5
LGALS4 6
TFF3 7
KLK7 8
CYP2C18 9
GPR64 10
CAPN2 11
MET 12
GCNT3 13
GAD1 14
SLC3A1 15
EGFL6 16
C1orf173 17
DKK4 18
C10orf116 19
FUT3 20
PBX1 21
MUC5B 22
KGFLP2 23
IGKC 24
IL6 25
CPNE8 26
CYP4B1 27
TP53 28
PAX8 29
SERPINA5 30
SENP8 31
BRCA1 32
STC1 33
SEMA6A 34
TSPAN8 35
LIN28B 36
EPAS1 37
ATP5G3 38
IGJ 39
SCGB1D2 40
BCL2 41
ADCYAP1R1 42
MAP1LC3A 43

4.4.2 SMOTE-SVM

Figure 4.19: Gene Optimization for SMOTE-SVM Classifier using Averaged F1-Score

In the SMOTE-SVM classifier, the highest averaged F1-score is achieved with 41 genes added, highlighted in red in Figure 4.19. Hence, the optimal total number of genes is n = 28 + 41 = 69.

The gene profile of the optimal set of genes used is displayed in Table 4.16. Base genes in the PrOTYPE and SPOT sets are annotated with green circles, and the added genes are annotated with yellow circles. The added genes are: EGFL6, IGJ, IGKC, TP53, DKK4, MUC5B, SLC3A1, MAP1LC3A, IGFBP1, CPNE8, SERPINA5, SCGB1D2, STC1, EPAS1, BRCA1, KGFLP2, SENP8, BCL2, PBX1, KLK7, C10orf116, LIN28B, LGALS4, ADCYAP1R1, IL6, ZBED1, WT1, TFF1, GCNT3, HNF1B, TFF3, CYP4B1, CYP2C18, TSPAN8, FUT3, MET, ATP5G3, SEMA6A, GPR64, PAX8 and C1orf173. Unused genes are annotated with red crosses.

Table 4.16: Gene Profile of Optimal Set in SMOTE-SVM Workflow
Set Genes PrOTYPE SPOT Optimal Set Candidate Rank
Base COL11A1
CD74
CD2
TIMP3
LUM
CYTIP
COL3A1
THBS2
TCF7L1
HMGA2
FN1
POSTN
COL1A2
COL5A2
PDZK1IP1
FBN1
HIF1A
CXCL10
DUSP4
SOX17
MITF
CDKN3
BRCA2
CEACAM5
ANXA4
SERPINE1
CRABP2
DNAJC9
Candidates EGFL6 1
IGJ 2
IGKC 3
TP53 4
DKK4 5
MUC5B 6
SLC3A1 7
MAP1LC3A 8
IGFBP1 9
CPNE8 10
SERPINA5 11
SCGB1D2 12
STC1 13
EPAS1 14
BRCA1 15
KGFLP2 16
SENP8 17
BCL2 18
PBX1 19
KLK7 20
C10orf116 21
LIN28B 22
LGALS4 23
ADCYAP1R1 24
IL6 25
ZBED1 26
WT1 27
TFF1 28
GCNT3 29
HNF1B 30
TFF3 31
CYP4B1 32
CYP2C18 33
TSPAN8 34
FUT3 35
MET 36
ATP5G3 37
SEMA6A 38
GPR64 39
PAX8 40
C1orf173 41
GAD1 42
CAPN2 43

4.4.3 Gene List Comparisons in Confirmation Set

We train the Hybrid-XGB and SMOTE-SVM workflows using the base and optimal gene lists in the training set. The models are evaluated on the confirmation set. Overall and per-class results are shown in Table 4.17. The gene lists are:

  1. Base (n=28): among the overlapping genes, the base set from the PrOTYPE and SPOT lists

  2. Optimal (n=47 for Hybrid-XGB, n=69 for SMOTE-SVM): among the overlapping genes, the base set plus the number of additional genes that yields the optimal averaged F1-score, as assessed in Figure 4.18 and Figure 4.19
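
As a concrete illustration of how the two gene lists relate, the sketch below builds the Hybrid-XGB optimal list from the base genes and the ranked candidates in Table 4.15. The gene names and ranks are taken from that table (only candidate ranks 1 to 20 are reproduced here). The function name `build_gene_list` is hypothetical.

```python
from typing import List

# The 28 base genes listed in Table 4.15.
BASE_GENES: List[str] = [
    "COL11A1", "CD74", "CD2", "TIMP3", "LUM", "CYTIP", "COL3A1", "THBS2",
    "TCF7L1", "HMGA2", "FN1", "POSTN", "COL1A2", "COL5A2", "PDZK1IP1", "FBN1",
    "HIF1A", "CXCL10", "DUSP4", "SOX17", "MITF", "CDKN3", "BRCA2", "CEACAM5",
    "ANXA4", "SERPINE1", "CRABP2", "DNAJC9",
]

# Hybrid-XGB candidate genes in rank order (ranks 1 to 20 of 43, Table 4.15).
HYBRID_XGB_CANDIDATES: List[str] = [
    "HNF1B", "TFF1", "TPX2", "WT1", "IGFBP1", "LGALS4", "TFF3", "KLK7",
    "CYP2C18", "GPR64", "CAPN2", "MET", "GCNT3", "GAD1", "SLC3A1", "EGFL6",
    "C1orf173", "DKK4", "C10orf116", "FUT3",
]


def build_gene_list(base: List[str], ranked_candidates: List[str], n_added: int) -> List[str]:
    """Return the base genes followed by the top n_added ranked candidates."""
    if n_added > len(ranked_candidates):
        raise ValueError("n_added exceeds the number of ranked candidates")
    return base + ranked_candidates[:n_added]


base_list = build_gene_list(BASE_GENES, HYBRID_XGB_CANDIDATES, 0)
optimal_list = build_gene_list(BASE_GENES, HYBRID_XGB_CANDIDATES, 19)
print(len(base_list), len(optimal_list))
```

FUT3, ranked 20th, is the first candidate left out of the Hybrid-XGB optimal list.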

Table 4.17: Model Comparisons using Different Gene Lists in Confirmation Set
Histotypes
Method Metric Overall HGSOC CCOC ENOC MUOC LGSOC
Hybrid-XGB, Optimal Accuracy 0.836 0.872 0.966 0.893 0.966 0.977
Sensitivity 0.694 0.946 0.861 0.458 0.704 0.500
Specificity 0.930 0.729 0.979 0.979 0.977 0.986
F1-Score 0.684 0.907 0.849 0.587 0.633 0.444
Balanced Accuracy 0.812 0.838 0.920 0.719 0.840 0.743
Kappa 0.670 0.703 0.830 0.531 0.616 0.433
Hybrid-XGB, Base Accuracy 0.810 0.858 0.952 0.872 0.970 0.967
Sensitivity 0.613 0.929 0.875 0.393 0.704 0.167
Specificity 0.923 0.720 0.961 0.968 0.982 0.983
F1-Score 0.606 0.896 0.803 0.506 0.667 0.160
Balanced Accuracy 0.768 0.825 0.918 0.680 0.843 0.575
Kappa 0.620 0.673 0.775 0.440 0.651 0.143
SMOTE-SVM, Optimal Accuracy 0.821 0.860 0.958 0.879 0.974 0.972
Sensitivity 0.666 0.932 0.847 0.458 0.593 0.500
Specificity 0.925 0.720 0.972 0.963 0.990 0.981
F1-Score 0.665 0.898 0.819 0.557 0.653 0.400
Balanced Accuracy 0.796 0.826 0.910 0.710 0.791 0.740
Kappa 0.640 0.676 0.795 0.490 0.639 0.386
SMOTE-SVM, Base Accuracy 0.815 0.860 0.952 0.874 0.970 0.974
Sensitivity 0.672 0.903 0.889 0.514 0.556 0.500
Specificity 0.930 0.775 0.960 0.946 0.989 0.983
F1-Score 0.660 0.895 0.805 0.576 0.612 0.414
Balanced Accuracy 0.801 0.839 0.924 0.730 0.772 0.741
Kappa 0.641 0.685 0.778 0.503 0.597 0.401
Figure 4.20: Gene List Comparisons of Evaluation Metrics in Confirmation Set

4.5 Validation Set

From the results in Table 4.17 and Figure 4.20, we see that the Hybrid-XGB, Optimal workflow has the highest overall F1-score and balanced accuracy. Thus, we train the Hybrid-XGB model on the training set using the all-overlap, optimal, and base gene lists, and evaluate its performance in the validation set.

4.5.1 Evaluation Metrics

Table 4.18: Evaluation Metrics on Training Set Models in Validation Set
Histotypes
Method Metric Overall HGSOC CCOC ENOC MUOC LGSOC
Hybrid-XGB, All Overlap Accuracy 0.889 0.905 0.969 0.955 0.978 0.972
Sensitivity 0.799 0.913 1.000 0.682 0.870 0.533
Specificity 0.958 0.877 0.966 0.985 0.980 0.980
F1-Score 0.715 0.938 0.831 0.750 0.667 0.390
Balanced Accuracy 0.879 0.895 0.983 0.833 0.925 0.756
Kappa 0.725 0.739 0.815 0.726 0.656 0.377
Hybrid-XGB, Optimal Accuracy 0.876 0.894 0.966 0.949 0.974 0.969
Sensitivity 0.789 0.904 0.971 0.625 0.913 0.533
Specificity 0.952 0.856 0.966 0.984 0.976 0.976
F1-Score 0.692 0.930 0.817 0.705 0.646 0.364
Balanced Accuracy 0.870 0.880 0.969 0.804 0.944 0.755
Kappa 0.694 0.709 0.799 0.677 0.634 0.349
Hybrid-XGB, Base Accuracy 0.843 0.877 0.954 0.923 0.966 0.966
Sensitivity 0.721 0.897 0.942 0.386 0.913 0.467
Specificity 0.937 0.805 0.955 0.981 0.968 0.975
F1-Score 0.615 0.919 0.760 0.496 0.583 0.318
Balanced Accuracy 0.829 0.851 0.949 0.684 0.940 0.721
Kappa 0.611 0.661 0.736 0.458 0.568 0.303
Figure 4.21: Evaluation Metrics on Validation Set Models

4.5.2 Confusion Matrices

Figure 4.22: Confusion Matrix for Training Set Models evaluated on Validation Data

4.5.3 ROC Curves

4.5.4 Calibration Plots

4.5.5 Summary

A summary of the Hybrid-XGB, Optimal model results is shown in Figure 4.29.

Figure 4.29: Validation Summary

4.5.6 Additional Explorations

Table 4.19: Clinicopath characteristics between correct and incorrect predictions of ENOC cases
Characteristic                  Predicted ENOC Correctly    Missed ENOC     p-value²
                                N = 55¹                     N = 33¹
Age at diagnosis                53 (46, 63)                 56 (51, 62)     0.3
Tumour grade                                                                0.003
    low grade                   42 (93%)                    17 (63%)
    high grade                  3 (6.7%)                    10 (37%)
    Unknown                     10                          6
FIGO tumour stage                                                           0.086
    I                           42 (78%)                    20 (61%)
    II-IV                       12 (22%)                    13 (39%)
    Unknown                     1                           0
Race                                                                        0.7
    white                       50 (93%)                    26 (90%)
    non-white                   4 (7.4%)                    3 (10%)
    Unknown                     1                           4
ARID1A                                                                      >0.9
    absent/subclonal            10 (18%)                    6 (18%)
    present                     45 (82%)                    27 (82%)
WT1                                                                         0.3
    diffuse (>50%)              2 (3.6%)                    3 (9.1%)
    focal (1-50%)               3 (5.5%)                    0 (0%)
    negative                    50 (91%)                    30 (91%)
TP53                                                                        0.059
    mutated                     1 (1.8%)                    4 (13%)
    wild type                   54 (98%)                    28 (88%)
    Unknown                     0                           1
PR                                                                          0.010
    diffuse (>50%)              35 (64%)                    11 (33%)
    focal (1-50%)               9 (16%)                     6 (18%)
    negative                    11 (20%)                    16 (48%)
P16                                                                         0.10
    abnormal block              2 (3.6%)                    4 (12%)
    abnormal complete absence   13 (24%)                    12 (36%)
    normal                      40 (73%)                    17 (52%)
NAPSIN A                                                                    0.5
    negative                    53 (96%)                    32 (100%)
    positive                    2 (3.6%)                    0 (0%)
    Unknown                     0                           1
¹ Median (Q1, Q3); n (%)
² Wilcoxon rank sum test; Fisher’s exact test; Pearson’s Chi-squared test
Figure 4.30: Volcano Plots of Validation Set Predictions
Figure 4.31: Boxplot of Most Differentially Expressed Genes
Figure 4.32: Subtype Prediction Summary among Predicted HGSOC Samples
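
The footnote to Table 4.19 lists three tests without saying which one applies to each row. As a hedged illustration, the sketch below assumes Fisher's exact test for the TP53 row (its cell counts are small) and computes a two-sided p-value from the known TP53 counts using only the standard library. The function name `fisher_exact_two_sided` is hypothetical and is not taken from the analysis code.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same margins
    that are no more likely than the observed table.
    """
    row1, col1, n = a + b, a + c, a + b + c + d

    def prob(k: int) -> float:
        # Probability that the top-left cell equals k given fixed margins.
        return comb(col1, k) * comb(n - col1, row1 - k) / comb(n, row1)

    p_observed = prob(a)
    lowest = max(0, row1 - (n - col1))
    highest = min(row1, col1)
    # The small relative tolerance guards against floating-point ties.
    return sum(p for p in (prob(k) for k in range(lowest, highest + 1))
               if p <= p_observed * (1 + 1e-7))


# TP53 row of Table 4.19, excluding the one case with unknown status:
# mutated: 1 correct, 4 missed; wild type: 54 correct, 28 missed.
p_tp53 = fisher_exact_two_sided(1, 4, 54, 28)
print(round(p_tp53, 3))
```

Under this assumption the computed value rounds to 0.059, which agrees with the p-value reported for TP53 in Table 4.19.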