We calculated the global similarity in addition to pairwise tests of each cluster.

Cover crop frequency was determined as the average number of cover crop plantings per year, calculated from cover crop planting counts over two growing years for each field site. To identify farm typologies based on indicators for soil organic matter levels, we first used several clustering algorithms. First, a k-means cluster analysis based on four key soil indicators (soil organic matter, total soil nitrogen, and available nitrogen) was used to generate three clusters of farm groups using the factoextra and cluster packages in R. The cluster analysis was divisive, non-hierarchical, and based on Euclidean distance, which measures the straight-line distance between the soil indicator combinations of every farm site in Cartesian space, and created a matrix of these distances. To determine the appropriate number of clusters, a scree plot was used to show how the total within-cluster sum of squares decreased as the number of clusters increased. The location of the kink in the curve of this scree plot delineated the optimal number of clusters, in this case three. To further explore the appropriate number of clusters, we used a histogram to determine the structure and spread of data among clusters. A Euclidean-based dendrogram analysis was then used to further validate the results of the cluster analysis. In addition to confirming the results of the cluster analysis, the dendrogram plot showed relationships between sites and relatedness across all sites.
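The scree-plot logic described above can be illustrated with a minimal sketch. The study itself used R; the Python code below is a stand-in for illustration only and is not the authors' code. The toy coordinates, the function names, and the deterministic farthest-first initialization are all assumptions made for this example (R's k-means routine uses random starts instead). The sketch runs a plain Lloyd's-algorithm k-means on squared Euclidean distance and reports the total within-cluster sum of squares for one to four clusters, the quantity a scree plot displays.

```python
# Toy two-dimensional "sites" (hypothetical values, three visibly separate groups).
POINTS = [
    (0, 0), (1, 0), (0, 1), (1, 1),
    (10, 0), (11, 0), (10, 1), (11, 1),
    (5, 10), (6, 10), (5, 11), (6, 11),
]


def sq_dist(a, b):
    """Squared Euclidean (straight-line) distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_first_init(points, k):
    """Deterministic start (an assumption of this sketch): take the first point,
    then repeatedly add the point farthest from its nearest chosen centroid."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        )
    return centroids


def kmeans(points, k, n_iter=100):
    """Plain Lloyd's algorithm; returns (centroids, labels)."""
    centroids = farthest_first_init(points, k)
    labels = []
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        updated = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                updated.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                updated.append(centroids[j])  # leave an empty cluster in place
        if updated == centroids:  # converged: assignments can no longer change
            break
        centroids = updated
    return centroids, labels


def total_wss(points, centroids, labels):
    """Total within-cluster sum of squares, the y-axis of a scree plot."""
    return sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))


wss_by_k = {}
for k in range(1, 5):
    centroids, labels = kmeans(POINTS, k)
    wss_by_k[k] = total_wss(POINTS, centroids, labels)
    print(k, round(wss_by_k[k], 2))
```

For this toy data the sum of squares falls steeply up to three clusters and only marginally afterwards, which is the kink a scree plot is read for. With real indicator data the variables would normally be standardized first, and several random starts would be compared.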

To visualize the cluster analysis results, the final three clusters were plotted on the axes produced by the cluster analysis. One drawback of cluster analyses is that they provide no measure of whether the groups identified are the most effective combination to explain clusters produced by soil indicators, or whether the groups are statistically different from one another. To address this gap, we used ANOSIM to evaluate and compare the differences between the clusters identified above. To formally establish the three farm types and make the functional link between organic matter and management explicit, we used the three clusters that emerged from the k-means cluster analysis based on soil organic matter indicators and explored differences in management approaches among the clusters. We then created three farm types based on this exploratory analysis. Specifically, we first analyzed management practices among sites within each cluster to determine whether similarities in management approaches emerged for each cluster. Based on this analysis, we used the three clusters to create three farm types categorized by soil organic matter levels and informed by the management practices applied. Using these three farm types, we then analyzed whether our classification created strong differences along soil texture and management gradients using a linear discriminant analysis (LDA). LDA is most frequently used as a pattern recognition technique; because LDA is a supervised classification, class membership must be known prior to analysis.
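As an illustration of the ANOSIM step mentioned above, the following minimal sketch computes the standard ANOSIM statistic, R = (mean between-group rank - mean within-group rank) / (M / 2), where M is the number of pairwise dissimilarities, and estimates a permutation p-value. It is written in Python for illustration and is not the authors' code. The one-dimensional toy values, the group labels, and all function names are assumptions introduced here.

```python
import random
from itertools import combinations


def average_ranks(values):
    """Rank values from 1 to n, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for idx in order[i:j + 1]:
            ranks[idx] = mean_rank
        i = j + 1
    return ranks


def anosim_r(dist, groups):
    """ANOSIM R from a square dissimilarity matrix and one label per site."""
    pairs = list(combinations(range(len(groups)), 2))
    ranks = average_ranks([dist[i][j] for i, j in pairs])
    within = [r for r, (i, j) in zip(ranks, pairs) if groups[i] == groups[j]]
    between = [r for r, (i, j) in zip(ranks, pairs) if groups[i] != groups[j]]
    mean_within = sum(within) / len(within)
    mean_between = sum(between) / len(between)
    return (mean_between - mean_within) / (len(pairs) / 2)


def anosim_test(dist, groups, n_perm=999, seed=0):
    """Observed R plus a permutation p-value obtained by shuffling labels."""
    rng = random.Random(seed)
    observed = anosim_r(dist, groups)
    shuffled = list(groups)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if anosim_r(dist, shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical one-dimensional indicator values for nine sites in three groups.
values = [0, 1, 2, 10, 11, 12, 20, 21, 22]
groups = ["I", "I", "I", "II", "II", "II", "III", "III", "III"]
dist = [[abs(a - b) for b in values] for a in values]

r_stat, p_value = anosim_test(dist, groups)
print(r_stat, p_value)
```

Because every within-group dissimilarity in this toy example is smaller than every between-group dissimilarity, R reaches its maximum of 1. Pairwise comparisons follow the same procedure applied to the sites of two groups at a time.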

The analysis tests the within-group covariance matrix of standardized variables and generates a probability of each farm site being categorized in the most appropriate group based on these variable matrices. To characterize soil texture, we used soil texture class. To characterize soil management, we used crop abundance, tillage frequency, and crop rotational complexity, the three management variables with the strongest gradient of difference among the three farm types. A confusion matrix was first applied to determine whether farm sites were correctly categorized among the three clusters created by the cluster analysis. Additional indicator statistics were also generated to confirm whether the LDA was sensitive to the input variables provided. A plot with axis loadings is provided to visualize the results of the LDA and display differences across farm groups. The LDA was carried out using the MASS R package. To build on the results of the LDA, we performed a variation partitioning analysis (VPA) to determine the level of variation in soil organic matter indicators explained by the soil texture variables, soil management variables, and their interactions. VPA was performed using the vegan package in R.

Using indicator variables for soil organic matter levels, we performed a k-means cluster analysis to develop a meaningful classification of farms. Scree plot results indicated that three clusters produced the most consistent separation of field sites. As shown in Figure 1, the two-dimensional cluster analysis produced a strong first dimension, which explained 86.7% of the separation among the 27 field sites. Total N, total C, POXC, and soil protein variables strongly explained this separation of farm types, as shown by the lack of overlap among the clusters along the Dimension 1 axis.
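The arithmetic behind a two-set variation partitioning, as used in the VPA described above, can be sketched as follows. This is an illustrative Python sketch, not the authors' code. It assumes that the explained variation (adjusted R², which is what vegan's varpart reports) has already been obtained for a texture-only model, a management-only model, and a combined model. The three input values and the function name are hypothetical.

```python
def partition_variation(r2_texture, r2_management, r2_both):
    """Split explained variation into unique, shared, and residual fractions.

    Inputs are (adjusted) R-squared values from three separate models:
    texture only, management only, and both predictor sets together.
    """
    return {
        # Variation explained only once texture is added to management.
        "unique_texture": r2_both - r2_management,
        # Variation explained only once management is added to texture.
        "unique_management": r2_both - r2_texture,
        # Variation that either predictor set could account for.
        "shared": r2_texture + r2_management - r2_both,
        # Variation left unexplained by the combined model.
        "residual": 1.0 - r2_both,
    }


# Hypothetical adjusted R-squared values, chosen only to show the arithmetic.
fractions = partition_variation(r2_texture=0.30, r2_management=0.25, r2_both=0.45)
for name, value in fractions.items():
    print(name, round(value, 2))
```

The unique and shared fractions always sum to the combined R², so the four fractions together account for all of the variation. With adjusted R² the shared fraction can come out slightly negative, which is usually read as zero overlap.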

Histogram results provide a visual summary of linear difference among the three clusters and further confirm minimal overlap among clusters; however, Cluster I and Cluster II fields showed low dissimilarity between values 0 and -2. Results from the average distance-based linkages of the dendrogram analysis similarly established the accuracy of the field site groupings determined by the cluster analysis. These results indicated that Cluster II sites were more closely related to Cluster III sites than to Cluster I sites. ANOSIM showed strongly significant global differences among the three clusters, where a value of 1 delineates 0% overlap between clusters. Overall, ANOSIM verified the farm types obtained from the cluster analysis. In addition, ANOSIM pairwise t-tests that compared the clusters in pairs confirmed strongly significant dissimilarities between Cluster I and Cluster III sites. ANOSIM pairwise t-tests also indicated that Cluster I sites were significantly divergent from Cluster II sites; however, Cluster I and Cluster II showed less dissimilarity than Cluster II and Cluster III sites. ANOSIM pairwise t-test results were congruent with the results provided by the histogram. Classification of farm sites using k-means clustering closely matched differences in on-farm management approaches. It is important to note that while general trends between clusters and management emerged, the management practices analyzed here do not fully encompass the management regimes of each farm field site, and are intended to be exploratory rather than definitive. Several general trends emerged across the three farm types. For instance, Farm Type I, comprising six field sites, consisted of fields with higher crop abundance values and fields that more frequently planted cover crops compared to Farm Type III. These sites used lower impact machines and applied a lower number of tillage passes compared to Farm Type II and III.
In contrast, Farm Type II, also comprising six field sites, and Farm Type III, comprising fifteen field sites, represented fields on the lower end of crop abundance values and sites that applied cover crop plantings at a lower frequency than Farm Type I. Farm Type III on average applied a higher number of tillage passes and was on the lower end of the ICLS index compared to both Farm Type I and Farm Type II. In general, Farm Type II used management approaches that frequently overlapped with Farm Type III, and less frequently overlapped with Farm Type I. Overall, farm types differed significantly based on indicators for soil organic matter levels. For all four indicators displayed in Figure 2, differences among the three farm types were highly significant. As visualized in the side-by-side box plot comparisons for all four indicators for soil organic matter levels, Farm Type I consistently showed the highest mean values across all four indicators, while Farm Type III consistently showed the lowest mean values across all four indicators. Farm Type I had mean values of 0.21 mg-N kg-soil-1 for total soil N, 2.3 mg-C kg-soil-1 for total organic C, 787 mg-C kg-soil-1 for POXC, and 7.4 g g-soil-1 for soil protein; compared to Farm Type I, Farm Type III had mean values 43% lower for total soil N, 48% lower for total organic C, 58% lower for POXC, and 66% lower for soil protein. Compared to Farm Type I, Farm Type II had mean values 38% lower for total soil N, 26% lower for total organic C, 28% lower for POXC, and 30% lower for soil protein. Standard errors for all four indicators are shown in Figure 2.

Results of the LDA showed that both linear discriminant factors are most strongly explained by soil texture, as shown by the LDA loadings. Management practices all equally, but weakly, influenced LD1 and LD2. LD1, which explained 66.3% of the variance, was effective at separating Farm Type I and Farm Type III.
However, Farm Type II overlapped with both Farm Type I and Farm Type III for LD1. In contrast, LD2, which explained 33.6% of the variance, did not display a definitive separation between Farm Type I and Farm Type III; however, LD2 was effective at separating Farm Type II from Farm Type I and Farm Type III.

LDA accurately discriminated between the three farm types, with an overall accuracy of 90.1%, as shown in Table 8. Model accuracy was high for all three farm types. The model had the greatest sensitivity to Farm Type II and Farm Type III, and low sensitivity to Farm Type I. Both Farm Type I and Farm Type III displayed minimal confusion with Farm Type II, as the comparison of training and validation data details. We determined the proportion of variation in the three farm types accounted for by management and by soil texture. Soil textural class contributed 28% of unique variation, while management contributed 18% of unique variation. The shared contribution for all predictors was 1%, and the overall contribution of all predictors was 47%.

We found across all 27 farm sites sampled that gross N mineralization rates ranged from 0.05 – 4.82 µg-NH4+ -N g-soil-1 day-1 and gross N nitrification rates ranged from 0.55 – 5.90 µg-NO3- -N g-soil-1 day-1. We determined net N mineralization rates ranged from 0.07 – 1.51 µg-NH4+ -N g-soil-1 day-1, while net N nitrification rates had a wider range from 1.53 – 25.18 µg-NO3- -N g-soil-1 day-1. We visually compare the six key N cycling variables (pools of inorganic N, and net and gross N rates) across the three farm types. Despite the variation in net and gross N mineralization and nitrification rates, using the farm types developed above, we found that N cycling variables were not significantly different across the three farm types for all six variables examined, based on ANOVA results. Given the variation in gross N rates reported above, we further explored the drivers of this variation using mixed modelling approaches. Table 10 provides results for the linear mixed models used for the prediction of potential gross ammonification rates. Soil ammonium concentration and % sand were significant predictors of gross mineralization rates.
While not significant, indicators for SOM were selected and also included in the model, based on AIC results. We also provide results from the selected linear mixed model used for prediction of potential gross nitrification rates in Table 11. As shown, indicators for SOM emerged as the sole significant covariate. While not significant, crop abundance was also selected and included in the model, as determined by AIC results.

This on-farm study found significant differentiation among the organic farm field sites sampled based on soil organic matter levels, which created a gradient in soil quality among the three farm types. While we found that differences in soil quality were generally aligned with trends in management among sites, soil texture, rather than management, emerged as the stronger driver of soil quality. Although we initially found that net and gross N cycling rates were not significantly different across farm types, gross N cycling rates showed considerable variation among farm types. To determine drivers of this variation, we explored key predictors for soil N cycling and found that SOM indicators influenced gross N mineralization and nitrification rates, in particular gross nitrification rates. Each of the four indicators for soil organic matter used in this study (total soil N, total organic C, POXC, and soil protein) showed a strong correlation with farm type, and collectively, created a gradient in soil quality.