Taylor Geospatial Institute
Facility: St Louis, United States
Research output, citation impact, and the most-cited recent papers from Taylor Geospatial Institute. Aggregated across the NobleBlocks index of 300M+ scholarly works.
Top-cited papers from Taylor Geospatial Institute
Accurate and efficient estimation of crop biophysical traits, such as leaf chlorophyll concentrations (LCC) and average leaf angle (ALA), is an important bridge between intelligent crop breeding and precision agriculture. While Unmanned Aerial Vehicle (UAV)-based hyperspectral sensors and advanced machine learning models offer high-throughput solutions, collecting sufficient ground truth data for machine learning training can be challenging, leading to models that lack generalizability for practical uses. This study proposes a transfer learning based dual stream neural network (DSNN) called PROSAIL-Net, which leverages the knowledge gained from PROSAIL simulation and improves the estimation of corn LCC and ALA from UAV-borne hyperspectral images. In addition to hyperspectral data, the DSNN also includes solar-sensor geometry data, which was automatically extracted from a cross-grid UAV flight. The hyperspectral branch in the DSNN was also tested with multi-layer perceptron (MLP), long short-term memory (LSTM), gated recurrent unit (GRU), and 1D convolutional neural network (CNN) architectures. The results suggest that the 1D CNN architecture exhibits superior performance compared to MLP, LSTM, and GRU networks when used in the spectral branch of DSNN. PROSAIL-Net outperforms all other modeling scenarios in predicting LCC (R2 0.66, NRMSE 8.81%) and ALA (R2 0.57, NRMSE 24.32%) and the use of multi-angular UAV observations significantly improves the prediction accuracy of both LCC (R2 improved from 0.52 to 0.66) and ALA (R2 improved from 0.35 to 0.57). This study highlights the importance of utilizing large amounts of PROSAIL-simulated data in conjunction with transfer learning and multi-angular UAV observations in precision agriculture.
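The R2 and NRMSE figures quoted in the abstract above can be reproduced from paired observed and predicted values. The sketch below is illustrative only: the function names are hypothetical, and it assumes NRMSE is the RMSE divided by the mean of the observed values, a normalization the abstract does not specify (some studies divide by the observed range instead).

```python
import math


def r2_score(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def nrmse_percent(observed, predicted):
    """RMSE as a percentage of the observed mean (assumed normalization)."""
    n = len(observed)
    rmse = math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)
    mean_obs = sum(observed) / n
    return 100.0 * rmse / mean_obs


# Made-up values, not data from the study.
observed = [2.0, 4.0, 6.0, 8.0]
predicted = [3.0, 3.0, 7.0, 7.0]
print(r2_score(observed, predicted), nrmse_percent(observed, predicted))
```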
Recent advances in unmanned aerial vehicles (UAV), mini and mobile sensors, and GeoAI (a blend of geospatial and artificial intelligence (AI) research) are the main highlights among agricultural innovations to improve crop productivity and thus secure vulnerable food systems. This study investigated the versatility of UAV-borne multisensory data fusion within a framework of multi-task deep learning for high-throughput phenotyping in maize. Data from UAVs equipped with a set of miniaturized sensors, including hyperspectral, thermal, and LiDAR, were collected over an experimental corn field in Urbana, IL, USA during the growing season. A full suite of eight phenotypes was measured in situ at the end of the season for ground truth data, specifically, dry stalk biomass, cob biomass, dry grain yield, harvest index, grain nitrogen utilization efficiency (Grain NutE), grain nitrogen content, total plant nitrogen content, and grain density. After being funneled through a series of radiometric calibrations and geo-corrections, the aerial data were analytically processed in three primary approaches. First, an extended version of the normalized difference spectral index (NDSI) served as a simple arithmetic combination of different data modalities to explore the degree of correlation with maize phenotypes. The extended NDSI analysis revealed that the NIR spectra (750–1000 nm) alone were strongly related to all eight maize traits. Second, a fusion of vegetation indices, structural indices, and thermal index selectively handcrafted from each data modality was fed to classical machine learning regressors, Support Vector Machine (SVM) and Random Forest (RF). The prediction performance varied from phenotype to phenotype, ranging from R2 = 0.34 for grain density up to R2 = 0.85 for both grain nitrogen content and total plant nitrogen content.
Further, a fusion of hyperspectral and LiDAR data completely exceeded limitations of single data modality, especially addressing the vegetation saturation effect occurring in optical remote sensing. Third, a multi-task deep convolutional neural network (CNN) was customized to take a raw imagery data fusion of hyperspectral, thermal, and LiDAR for multi-predictions of maize traits at a time. The multi-task deep learning performed predictions comparably, if not better in some traits, with the mono-task deep learning and machine learning regressors. Data augmentation used for the deep learning models boosted the prediction accuracy, which helps to alleviate the intrinsic limitation of a small sample size and unbalanced sample classes in remote sensing research. Theoretical and practical implications to plant breeders and crop growers were also made explicit during discussions in the studies.
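As an illustration of the NDSI screening step described in the abstract above, the sketch below computes the standard two-band index, (a - b) / (a + b), for every band pair and ranks the pairs by squared Pearson correlation with a trait. It is an assumption-laden sketch: the abstract's extended NDSI combines several data modalities, whereas this version uses plain per-plot band values, and the function names and numbers are hypothetical.

```python
import math


def ndsi(band_a, band_b):
    """Normalized difference of two band values: (a - b) / (a + b)."""
    return (band_a - band_b) / (band_a + band_b)


def pearson_r(xs, ys):
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def best_band_pair(plots, trait):
    """Return (r_squared, i, j) for the band pair whose NDSI tracks the trait best.

    plots: one list of band values per plot; trait: one measured value per plot.
    """
    n_bands = len(plots[0])
    best = (0.0, None, None)
    for i in range(n_bands):
        for j in range(i + 1, n_bands):
            index_values = [ndsi(plot[i], plot[j]) for plot in plots]
            r_squared = pearson_r(index_values, trait) ** 2
            if r_squared > best[0]:
                best = (r_squared, i, j)
    return best


# Made-up reflectances for four plots and three bands, with a made-up trait.
plots = [[0.1, 0.5, 0.3], [0.1, 0.4, 0.3], [0.1, 0.3, 0.3], [0.1, 0.2, 0.3]]
trait = [4.0, 3.0, 2.0, 1.0]
print(best_band_pair(plots, trait))
```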
The food production system is vulnerable to diseases more than ever, and the threat is increasing in an era of climate change that creates more favorable conditions for emerging diseases. Fortunately, scientists and engineers are making great strides to introduce farming innovations to tackle the challenge. Unmanned aerial vehicle (UAV) remote sensing is among the innovations and thus is widely applied for crop health monitoring and phenotyping. This study demonstrated the versatility of aerial remote sensing in diagnosing yellow rust infection in spring wheats in a timely manner and determining an intervenable period to prevent yield loss. A small UAV equipped with an aerial multispectral sensor periodically flew over, and collected remotely sensed images of, an experimental field in Chacabuco (−34.64; −60.46), Argentina during the 2021 growing season. Post-collection images at the plot level were engaged in a thorough feature-engineering process by handcrafting disease-centric vegetation indices (VIs) from the spectral dimension, and grey-level co-occurrence matrix (GLCM) texture features from the spatial dimension. A machine learning pipeline entailing a support vector machine (SVM), random forest (RF), and multilayer perceptron (MLP) was constructed to identify locations of healthy, mild infection, and severe infection plots in the field. A custom 3-dimensional convolutional neural network (3D-CNN) relying on the feature learning mechanism was an alternative prediction method. The study found red-edge (690–740 nm) and near infrared (NIR) (740–1000 nm) as vital spectral bands for distinguishing healthy and severely infected wheats. The carotenoid reflectance index 2 (CRI2), soil-adjusted vegetation index 2 (SAVI2), and GLCM contrast texture at an optimal distance d = 5 and angular direction θ = 135° were the most correlated features. 
The 3D-CNN-based wheat disease monitoring performed at 60% detection accuracy as early as 40 days after sowing (DAS), when crops were tillering, increasing to 71% and 77% at the later booting and flowering stages (100–120 DAS), and reaching a peak accuracy of 79% for the spectral-spatio-temporal fused data model. The success of early disease diagnosis from low-cost multispectral UAVs not only shed new light on crop breeding and pathology but also aided crop growers by informing them of a prevention period that could potentially preserve 3–7% of the yield at the confidence level of 95%.
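The GLCM contrast texture named in the abstract above can be sketched in a few lines. This is a minimal illustration, not the authors' pipeline: it assumes that the 135° direction pairs each pixel with the pixel `distance` rows up and `distance` columns to the left (conventions differ between libraries), and the function names are hypothetical.

```python
from collections import Counter


def glcm_135(image, distance):
    """Normalized grey-level co-occurrence matrix along the assumed 135° diagonal.

    image: 2D list of integer grey levels.
    Returns a dict mapping (level, neighbour_level) to its probability.
    """
    rows, cols = len(image), len(image[0])
    counts = Counter()
    for r in range(distance, rows):
        for c in range(distance, cols):
            # Neighbour lies `distance` rows up and `distance` columns left.
            counts[(image[r][c], image[r - distance][c - distance])] += 1
    total = sum(counts.values())
    if total == 0:
        raise ValueError("image is smaller than the requested offset")
    return {pair: n / total for pair, n in counts.items()}


def glcm_contrast(glcm):
    """Contrast: sum of P(i, j) * (i - j)^2 over all grey-level pairs."""
    return sum(p * (i - j) ** 2 for (i, j), p in glcm.items())


# Toy image whose grey level equals its row index, so every pair differs by 2.
image = [[r for _ in range(8)] for r in range(8)]
print(glcm_contrast(glcm_135(image, distance=2)))
```

A perfectly uniform patch yields a contrast of zero, while larger grey-level differences along the chosen direction raise the value.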
Crop yield prediction from UAV images has significant potential in accelerating and revolutionizing crop breeding pipelines. Although convolutional neural networks (CNN) provide easy, accurate and efficient solutions over traditional machine learning models in computer vision applications, training a CNN requires a large amount of ground truth data, which is often difficult to collect in the agricultural context. The major objective of this study was to develop an end-to-end 3D CNN model for plot-scale soybean yield prediction using multitemporal UAV-based RGB images with approximately 30,000 sample plots. A low-cost UAV-RGB system was utilized and multitemporal images from 13 different experimental fields were collected in Argentina in 2021. Three commonly used 2D CNN architectures (i.e., VGG, ResNet and DenseNet) were transformed into 3D variants to incorporate the temporal data as the third dimension. Additionally, multiple spatiotemporal resolutions were considered as data input and the CNN architectures were trained with different combinations of input shapes. The results reveal that: (a) DenseNet provided the most efficient result (R2 0.69) in terms of accuracy and model complexity, followed by VGG (R2 0.70) and ResNet (R2 0.65); (b) Finer spatiotemporal resolution did not necessarily improve the model performance but increased the model complexity, while the coarser resolution achieved comparable results; and (c) DenseNet showed lower clustering patterns in its prediction maps compared to the other models. This study clearly identifies that multitemporal observation with UAV-based RGB images provides enough information for the 3D CNN architectures to accurately estimate soybean yield non-destructively and efficiently.
The Annual Meeting of the American Association of Geographers (AAG) in 2023 marked a five-year milestone since the first Geospatial Artificial Intelligence (GeoAI) Symposium was held at AAG in 2018. In the past five years, progress has been made while open questions remain. In this context, we organized an AAG panel and invited five panellists to discuss the advances and limitations in GeoAI research. The panellists commended the successes, such as the development of spatially explicit models, the production of large-scale geographic datasets, and the use of GeoAI to address real-world problems. The panellists also shared their thoughts on limitations in current GeoAI research, which were considered as opportunities to engage theories in geography, enhance model explainability, quantify uncertainty, and improve model generalizability. This article summarizes the presentations from the panellists and also provides after-panel thoughts from the organizers. We hope that this article can make these thoughts more accessible to interested readers and help stimulate new ideas for future breakthroughs.
The pre-harvest estimation of seed composition from standing crops is imperative for field management practices and plant phenotyping. This paper presents for the first time the potential of Unmanned Aerial Vehicle (UAV)-based high-resolution hyperspectral and LiDAR data acquired from in-season standing crops for estimating seed protein and oil compositions of soybean and corn using multisensory data fusion and automated machine learning. UAV-based hyperspectral and LiDAR data were collected during the growing season (reproductive stage five (R5)) of 2020 over a soybean test site near Columbia, Missouri and a cornfield at Urbana, Illinois, USA. Canopy spectral and texture features were extracted from hyperspectral imagery, and canopy structure features were derived from LiDAR point clouds. The extracted features were then used as input variables for automated machine-learning methods available with the H2O Automated Machine-Learning framework (H2O-AutoML). The results showed that: (1) UAV hyperspectral imagery can successfully predict both the protein and oil of soybean and corn with moderate accuracies; (2) canopy structure features derived from LiDAR point clouds yielded slightly poorer estimates of crop-seed composition compared to the hyperspectral data; (3) regardless of machine-learning methods, the combination of hyperspectral and LiDAR data outperformed the predictions using a single sensor alone, with an R2 of 0.79 and 0.67 for corn protein and oil and R2 of 0.64 and 0.56 for soybean protein and oil; and (4) the H2O-AutoML framework was found to be an efficient strategy for machine-learning-based data-driven model building. Among the specific regression methods evaluated in this study, the Gradient Boosting Machine (GBM) and Deep Neural Network (NN) exhibited superior performance to other methods. This study reveals opportunities and limitations for multisensory UAV data fusion and automated machine learning in estimating crop-seed composition.
Soybean is a pivotal agricultural commodity around the world, primarily because of its high seed protein and oil concentration. Therefore, farmers, breeders and end-users are highly interested in understanding and predicting the soybean seed composition traits from the individual field level or agroecosystem. Seed composition traits are the proportions of different chemical and physical makeup of soybean seeds. The frequent daily coverage of the PlanetScope (PS) satellites provides a unique opportunity to estimate seed composition due to their ability to track crop growth and development with a unique combination of high spatial and temporal resolution. We aim to predict six different soybean seed composition traits (i.e., protein, oil, sucrose, fiber, ash, starch) using PS imagery of standing soybean crops and machine learning algorithms. We developed a multi-stream deep neural network based on two types of recurrent neural networks, i.e., long short-term memory (LSTM) and gated recurrent unit (GRU), that utilize temporal phenology observed from PS. Four statistical machine learning algorithms, i.e., partial least squares (PLSR), random forest (RFR), gradient boosting machine (GBM), and support vector machine (SVR), were used for comparison. Our results show that GRU worked well for protein (R2 0.36, NRMSE 3.62%) and oil (R2 0.53, NRMSE 4.78%), SVR showed the best results for sucrose (R2 0.74, NRMSE 8.34%), fiber (R2 0.21, NRMSE 4.20%), and starch (R2 0.15, NRMSE 16.84%), and PLSR provided the best result for ash (R2 0.60, NRMSE 1.70%). Among the features, vegetation indices at later reproductive stages were found to be the most important variables compared to texture features. Overall, the study reveals the feasibility and efficiency of PS images and machine learning for plot-level seed composition estimation.
Introduction: As emerging infectious diseases (EIDs) increase, examining the underlying social and environmental conditions that drive EIDs is urgently needed. Ecological niche modeling (ENM) is increasingly employed to predict disease emergence based on the spatial distribution of biotic conditions and interactions, abiotic conditions, and the mobility or dispersal of vector-host species, as well as social factors that modify the host species’ spatial distribution. Still, ENM applied to EIDs is relatively new with varying algorithms and data types. We conducted a systematic review (PROSPERO: CRD42021251968) with the research question: What is the state of the science and practice of estimating ecological niches via ENM to predict the emergence and spread of vector-borne and/or zoonotic diseases? Methods: We searched five research databases and eight widely recognized One Health journals between 1995 and 2020. We screened 383 articles at the abstract level (included if study involved vector-borne or zoonotic disease and applied ENM) and 237 articles at the full-text level (included if study described ENM features and modeling processes). Our objectives were to: (1) describe the growth and distribution of studies across the types of infectious diseases, scientific fields, and geographic regions; (2) evaluate the likely effectiveness of the studies to represent ecological niches based on the biotic, abiotic, and mobility framework; (3) explain some potential pitfalls of ENM algorithms and techniques; and (4) provide specific recommendations for future studies on the analysis of ecological niches to predict EIDs. Results: We show that 99% of studies included mobility factors, 90% modeled abiotic factors with more than half in tropical climate zones, and 54% modeled biotic conditions and interactions. Of the 121 studies, 7% include only biotic and mobility factors, 45% include only abiotic and mobility factors, and 45% fully integrated the biotic, abiotic, and mobility data.
Only 13% of studies included modifying social factors such as land use. A majority of studies (77%) used well-recognized ENM algorithms (MaxEnt and GARP) and model selection procedures. Most studies (90%) reported model validation procedures, but only 7% reported uncertainty analysis. Discussion: Our findings bolster ENM to predict EIDs that can help inform the prevention of outbreaks and future epidemics. Systematic review registration: https://www.crd.york.ac.uk/prospero/, identifier CRD42021251968.
Wheat, being the third largest U.S. crop and the principal food grain, faces significant risks from climate extremes such as drought. This necessitates identifying and developing methods for early water-stress detection to prevent yield loss and improve water-use efficiency. This study investigates the potential of hyperspectral imaging to detect the early stages of drought stress in wheat. The goal is to utilize this technology as a tool for screening and selecting drought-tolerant wheat genotypes in breeding programs. Additionally, this research aims to systematically evaluate the effectiveness of various existing sensors and methods for detecting early stages of water stress. The experiment was conducted in a durum wheat experimental field trial in Maricopa, Arizona, in the spring of 2019 and included well-watered and water-limited treatments of a panel of 224 replicated durum wheat genotypes. Spectral indices derived from hyperspectral imagery were compared against other plant-level indicators of water stress such as Photosystem II (PSII) and relative water content (RWC) data derived from proximal sensors. Our findings showed a 12% drop in photosynthetic activity in the most affected genotypes when compared to the least affected. The Leaf Water Vegetation Index 1 (LWVI1) highlighted differences between drought-resistant and drought-susceptible genotypes. Drought-resistant genotypes retained 43.36% more water in leaves under well-watered conditions compared to water-limited conditions, while drought-susceptible genotypes retained only 15.69% more. The LWVI1 and LWVI2 indices, aligned with the RWC measurements, revealed a strong inverse correlation in the susceptible genotypes, underscoring their heightened sensitivity to water stress in earlier stages. Several genotypes previously classified based on their drought resistance showed spectral indices deviating from expectations. 
Results from this research can aid farmers in improving crop yields by informing early management practices. Moreover, this research offers wheat breeders insights into the selection of drought-tolerant genotypes, a requirement that is becoming increasingly important as weather patterns continue to change.
Hyperspectral sensors provide near-continuous spectral data that can facilitate advancements in agricultural crop classification and characterization, which are important for addressing global food and water security issues. We investigated two new-generation hyperspectral sensors, Germany’s Deutsches Zentrum für Luft‐ und Raumfahrt Earth Sensing Imaging Spectrometer (DESIS) and Italy’s PRecursore IperSpettrale della Missione Applicativa (PRISMA), within California's Central Valley in August 2021 focusing on five irrigated agricultural crops (alfalfa, almonds, corn, grapes, and pistachios). With reference data from the U.S. Department of Agriculture Cropland Data Layer, we developed a spectral library of the crops and classified them using three machine learning algorithms (support vector machines [SVM], random forest [RF], and spectral angle mapper [SAM]) and two philosophies: (1) full spectral analysis (FSA) and (2) optimal hyperspectral narrowband (OHNB) analysis. For FSA, we used 59 DESIS four-bin product bands and 207 of 238 PRISMA bands. For OHNB analysis, 9 DESIS and 16 PRISMA nonredundant OHNBs for studying crops were selected. FSA achieved only 1% to 3% higher accuracies relative to OHNB analysis in most cases. SVM provided the best results, closely followed by RF. Using both DESIS and PRISMA image OHNBs in SVM for classification led to higher accuracy than using either image alone, with an overall accuracy of 99%, producer’s accuracies of 94% to 100%, and user’s accuracies of 95% to 100%.
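Of the three classifiers named in the abstract above, the spectral angle mapper (SAM) is the simplest to sketch: each pixel spectrum is assigned to the library spectrum with which it forms the smallest angle. The snippet below is a minimal illustration; the function names are hypothetical and the three-band "library" values are made up, not taken from the DESIS or PRISMA data.

```python
import math


def spectral_angle(pixel, reference):
    """Angle in radians between two spectra treated as vectors."""
    dot = sum(p * r for p, r in zip(pixel, reference))
    norm_p = math.sqrt(sum(p * p for p in pixel))
    norm_r = math.sqrt(sum(r * r for r in reference))
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    cos_angle = max(-1.0, min(1.0, dot / (norm_p * norm_r)))
    return math.acos(cos_angle)


def classify_sam(pixel, library):
    """Return the library class whose reference spectrum has the smallest angle."""
    return min(library, key=lambda name: spectral_angle(pixel, library[name]))


# Hypothetical three-band reference spectra.
library = {"corn": [0.05, 0.08, 0.45], "almonds": [0.10, 0.12, 0.30]}
# A brighter pixel with the same spectral shape as the corn reference.
print(classify_sam([0.10, 0.16, 0.90], library))
```

Because the angle ignores vector length, a spectrum scaled by overall brightness still matches its reference, which is one reason SAM is often used as an illumination-tolerant baseline.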
Crop field boundaries are foundational datasets for agricultural monitoring and assessments but are expensive to collect manually. Machine learning (ML) methods for automatically extracting field boundaries from remotely sensed images could help realize the demand for these datasets at a global scale. However, current ML methods for field instance segmentation lack sufficient geographic coverage, accuracy, and generalization capabilities. Further, research on improving ML methods is restricted by the lack of labeled datasets representing the diversity of global agricultural fields. We present Fields of The World (FTW), a novel ML benchmark dataset for agricultural field instance segmentation spanning 24 countries on four continents (Europe, Africa, Asia, and South America). FTW is an order of magnitude larger than previous datasets with 70,462 samples, each containing instance and semantic segmentation masks paired with multi-date, multi-spectral Sentinel-2 satellite images. We provide results from baseline models for the new FTW benchmark, show that models trained on FTW have better zero-shot and fine-tuning performance in held-out countries than models that aren't pre-trained with diverse datasets, and show positive qualitative zero-shot results of FTW models in a real-world scenario: running on Sentinel-2 scenes over Ethiopia.
Species Distribution Modelling (SDM) techniques, developed in the 1980s, have gained significant attention in recent years. These techniques are increasingly recognized as powerful tools to support forest management strategies in the context of climate change. This study presents a comprehensive literature review of SDM techniques in mountainous environments, utilizing remote sensing techniques and data. Forty-one published papers were reviewed, covering 25 years (1997–2022). The review explores various SDM techniques, the use of remotely sensed data, accuracy assessments, environmental variables, and the limitations and challenges of species distribution modeling in mountainous environments across different spatial scales. The study revealed that the most widely used SDM techniques were Maximum Entropy (MaxEnt), Random Forest (RF), and Generalized Linear Models (GLMs), with recent studies emphasizing machine learning. We describe different modeling algorithms, including presence-only and presence/absence modeling algorithms, machine-learning algorithms, distance-based algorithms, and regression-based algorithms. This study presents the first global literature review of SDM techniques in mountainous environments, emphasizing the necessity of considering the uncertainties associated with climate change scenarios. This study also discusses the strengths and limitations of SDM techniques in mountainous environments. Despite limitations of SDM techniques, the study found an increasing trend in their application in mountainous environments. Finally, this review aims to provide a valuable resource for forest managers, researchers, practitioners, and policymakers employed in forest conservation in mountainous environments around the globe. • Review of forty-two papers between 1997 and 2022 on SDMs using remote sensing data. • Increasing trend of using machine learning methods for SDMs in mountain environments. • Discussion on coping with uncertainties in SDMs.
Soybean is an essential crop to fight global food insecurity and is of great economic importance around the world. Along with genetic improvements aimed at boosting yield, soybean seed composition has also changed. Since conditions during crop growth and development influence nutrient accumulation in soybean seeds, remote sensing offers a unique opportunity to estimate seed traits from the standing crops. Capturing phenological developments that influence seed composition requires frequent satellite observations at higher spatial and spectral resolutions. This study introduces a novel spectral fusion technique called multiheaded kernel-based spectral fusion (MKSF) that combines the higher spatial resolution of PlanetScope (PS) and spectral bands from Sentinel 2 (S2) satellites. The study also focuses on using the additional spectral bands and different statistical machine learning models to estimate seed traits, e.g., protein, oil, sucrose, starch, ash, fiber, and yield. The MKSF was trained using PS and S2 image pairs from different growth stages and predicted the potential VNIR1 (705 nm), VNIR2 (740 nm), VNIR3 (783 nm), SWIR1 (1610 nm), and SWIR2 (2190 nm) bands from the PS images. Our results indicate that VNIR3 prediction performance was the highest followed by VNIR2, VNIR1, SWIR1, and SWIR2. Among the seed traits, sucrose yielded the highest predictive performance with the RFR model. Finally, the feature importance analysis revealed the importance of MKSF-generated vegetation indices from fused images.
Localization is a primary concern for wireless sensor networks as numerous applications rely on the precise position of nodes. This paper presents a precise deep learning (DL) approach for DV-Hop localization in the Internet of Things (IoT) using the whale optimization algorithm (WOA) to alleviate shortcomings of traditional DV-Hop. Our method leverages a deep neural network (DNN) to estimate distances between undetermined nodes (non-coordinated nodes) and anchor nodes (coordinated nodes) without imposing excessive costs on IoT infrastructure. Because DL techniques require extensive training data for accuracy, we address this challenge by introducing a data augmentation strategy (DAS). The proposed algorithm involves creating virtual anchors strategically around real anchors, thereby generating additional training data and significantly enhancing dataset size, improving the efficacy of DNNs. Simulation findings suggest that the proposed deep learning model on DV-Hop localization outperforms other localization methods, particularly regarding positional accuracy.
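For context, the traditional DV-Hop baseline that the abstract above sets out to improve can be sketched as follows. This is the classic distance-estimation step only (hop counting, average hop size per anchor, hop size times hop count), not the paper's DNN, WOA, or virtual-anchor method. The function names and the toy topology are hypothetical, and the sketch assumes an undirected, connected network with at least two anchors.

```python
import math
from collections import deque


def hop_counts(adjacency, source):
    """Breadth-first search: minimum hop count from source to each reachable node."""
    hops = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency[node]:
            if neighbour not in hops:
                hops[neighbour] = hops[node] + 1
                queue.append(neighbour)
    return hops


def average_hop_size(anchor, anchors, adjacency):
    """Sum of true distances to the other anchors divided by the sum of hop counts."""
    hops = hop_counts(adjacency, anchor)
    ax, ay = anchors[anchor]
    total_distance, total_hops = 0.0, 0
    for other, (ox, oy) in anchors.items():
        if other != anchor and other in hops:
            total_distance += math.hypot(ax - ox, ay - oy)
            total_hops += hops[other]
    return total_distance / total_hops


def estimate_distances(node, anchors, adjacency):
    """Estimate node-to-anchor distances using the nearest anchor's hop size."""
    hops = hop_counts(adjacency, node)
    nearest = min(anchors, key=lambda name: hops[name])
    hop_size = average_hop_size(nearest, anchors, adjacency)
    return {name: hop_size * hops[name] for name in anchors}


# Toy chain A - u - B with anchors A and B placed 20 units apart.
adjacency = {"A": ["u"], "u": ["A", "B"], "B": ["u"]}
anchors = {"A": (0.0, 0.0), "B": (20.0, 0.0)}
print(estimate_distances("u", anchors, adjacency))
```

In the full classic algorithm these distance estimates feed a multilateration step to obtain coordinates; that step is omitted here.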
Soybean seed composition, particularly protein and oil content, plays a critical role in agricultural practices, influencing crop value, nutritional quality, and marketability. Accurate and efficient methods for predicting seed composition are essential for optimizing crop management and breeding strategies. This study assesses the effectiveness of combining handheld spectroradiometers with the Mexican Hat wavelet transformation to predict soybean seed composition at both seed and canopy levels. Initial analyses using raw spectral data from these devices showed limited predictive accuracy. However, by using the Mexican Hat wavelet transformation, meaningful features were extracted from the spectral data, significantly enhancing prediction performance. Results showed improvements: for seed-level data, Partial Least Squares Regression (PLSR), a method used to reduce spectral data complexity while retaining critical information, showed R2 values increasing from 0.57 to 0.61 for protein content and from 0.58 to 0.74 for oil content post-transformation. Canopy-level data analyzed with Random Forest Regression (RFR), an ensemble method designed to capture non-linear relationships, also demonstrated substantial improvements, with R2 increasing from 0.07 to 0.44 for protein and from 0.02 to 0.39 for oil content post-transformation. These findings demonstrate that integrating handheld spectroradiometer data with wavelet transformation bridges the gap between high-end spectral imaging and practical, accessible solutions for field applications. This approach not only improves the accuracy of seed composition prediction at both seed and canopy levels but also supports more informed decision-making in crop management. This work represents a significant step towards making advanced crop assessment tools more accessible, potentially improving crop management strategies and yield optimization across various farming scales.
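The Mexican Hat wavelet transformation mentioned in the abstract above can be illustrated with a direct, unoptimized continuous wavelet transform of a one-dimensional spectrum. This is a sketch under stated assumptions: it uses the standard normalized Mexican Hat (Ricker) wavelet, samples taken at unit spacing, and a single scale. The function names are hypothetical, and the abstract does not state which scales or implementation the authors used.

```python
import math


def mexican_hat(t):
    """Normalized Mexican Hat wavelet: 2 / (sqrt(3) * pi^(1/4)) * (1 - t^2) * exp(-t^2 / 2)."""
    norm = 2.0 / (math.sqrt(3.0) * math.pi ** 0.25)
    return norm * (1.0 - t * t) * math.exp(-t * t / 2.0)


def cwt_mexican_hat(signal, scale):
    """Wavelet coefficients of a 1D signal at one scale, one coefficient per sample."""
    n = len(signal)
    coefficients = []
    for shift in range(n):
        total = 0.0
        for k in range(n):
            # Wavelet centred on `shift` and stretched by `scale`.
            total += signal[k] * mexican_hat((k - shift) / scale)
        coefficients.append(total / math.sqrt(scale))
    return coefficients


# Toy "spectrum": flat baseline with a single narrow feature at index 10.
spectrum = [0.0] * 21
spectrum[10] = 1.0
coefficients = cwt_mexican_hat(spectrum, scale=2.0)
print(coefficients.index(max(coefficients)))
```

The coefficients peak where the signal contains a feature of roughly the chosen width, which is how the transform can expose absorption-like features that are hard to see in raw reflectance.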
Realizing the potential of artificial intelligence (AI) and machine learning (ML) in agriculture for improving crop yields and reducing the use of water, fertilizers, and pesticides remains a challenge. The goal of this work was to introduce Hyperfidelis, a geospatial software package that provides a comprehensive workflow that includes imagery visualization, feature extraction, zonal statistics, and modeling of key agricultural traits including chlorophyll content, yield, and leaf area index in an ML framework that can be used to improve food security. The platform combines a user-friendly graphical user interface with cutting-edge machine learning techniques, bridging the gap between plant science, agronomy, remote sensing, and data science without requiring users to possess any coding knowledge. Hyperfidelis offers several data engineering and machine learning algorithms that can be employed without scripting, which will prove essential in the plant science community.
River morphology data are critical for understanding and studying river processes and for managing rivers for multiple socio-economic uses. While such data have been acquired extensively over time, several issues hinder their use for river morphology studies such as data accessibility, variety of data formats, lack of data models for data storage, and lack of processing tools to assemble the data in products readily usable for research, management, and education. A multi-university research team has prototyped a web-based river morphology information system (RIMORPHIS) for hosting and creating new information and data processing tools to be shared with the broader earth science communities. The RIMORPHIS design principles include: (i) broad access via a publicly and freely available platform-independent system; (ii) flexibility in handling existing and future data types; (iii) user-friendly and interactive interfaces; and (iv) interoperability and scalability to ensure platform sustainability. Development of such an ambitious community resource is only possible by continuously engaging stakeholders from the inception of the project. This paper highlights the research team’s strategy and activities to connect and engage with river morphology data producers and potential users from academia, research, and practice. The paper also details outcomes of stakeholder engagement and illustrates how these interactions are positively shaping RIMORPHIS development.
Monitoring plantations is crucial for crop management and producing healthy harvests. Unmanned Aerial Vehicles (UAVs) have been used to collect multispectral images that aid in this monitoring. However, given the number of hectares to be monitored and the limitations of flight, plant disease signals become visually clear only in the later stages of plant growth and only if the disease has spread throughout a significant portion of the plantation. This limited amount of relevant data hampers the prediction models, as the algorithms struggle to generalize patterns effectively with unbalanced or unrealistic augmented datasets. To address this issue, we propose PlantPlotGAN, a physics-informed generative model capable of creating synthetic multispectral plot images with realistic vegetation indices. These indices served as a proxy for disease detection and were used to evaluate if our model could help increase the accuracy of prediction models. The results demonstrate that the synthetic imagery generated from PlantPlotGAN outperforms state-of-the-art methods regarding the Fréchet inception distance. Moreover, prediction models achieve higher accuracy metrics when trained with synthetic and original imagery for earlier plant disease detection compared to the training processes based solely on real imagery.
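For readers unfamiliar with the metric, the Fréchet inception distance (FID) referred to in the abstract above is commonly defined as follows. The subscripts are notation chosen here: r denotes statistics of features extracted from real images and g those from generated images, with the features usually taken from a pretrained Inception network. Lower values indicate that the generated images are statistically closer to the real ones.

```latex
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\left( \Sigma_r \Sigma_g \right)^{1/2} \right)
```

Here \mu and \Sigma are the mean vector and covariance matrix of the feature embeddings for each image set.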
Wheat is a globally cultivated cereal crop with substantial protein content present in its seeds. This research aimed to develop robust methods for predicting seed protein concentration in wheat seeds using bench-top hyperspectral imaging in the visible, near-infrared (VNIR), and shortwave infrared (SWIR) regions. To fully utilize the spectral and texture features of the full VNIR and SWIR spectral domains, a computer-vision-aided image co-registration methodology was implemented to seamlessly align the VNIR and SWIR bands. Sensitivity analyses were also conducted to identify the most sensitive bands for seed protein estimation. Convolutional neural networks (CNNs) with attention mechanisms were proposed along with traditional machine learning models based on feature engineering including Random Forest (RF) and Support Vector Machine (SVM) regression for comparative analysis. Additionally, the CNN classification approach was used to estimate low, medium, and high protein concentrations because this type of classification is more applicable for breeding efforts. Our results showed that the proposed CNN with attention mechanisms predicted wheat protein content with R2 values of 0.70 and 0.65 for ventral and dorsal seed orientations, respectively. Although the R2 of the CNN approach was lower than that of the best-performing feature-based method, RF (R2 of 0.77), end-to-end prediction capabilities with CNN hold great promise for the automation of wheat protein estimation for breeding. The CNN model achieved better classification of protein concentrations between low, medium, and high protein contents, with an R2 of 0.82. This study’s findings highlight the significant potential of hyperspectral imaging and machine learning techniques for advancing precision breeding practices, optimizing seed sorting processes, and enabling targeted agricultural input applications.
Spatial clusters contain biases and artifacts, whether they are defined via statistical algorithms or via expert judgment. Graph‐based partitioning of spatial data and associated heuristics gained popularity due to their scalability but can define suboptimal regions due to algorithmic biases such as chaining. Despite the broad literature on deterministic regionalization methods, approaches that quantify regionalization probability are sparse. In this article, we propose a local method to quantify regionalization probabilities for regions defined via graph‐based cuts and expert‐defined regions. We conceptualize spatial regions as consisting of two types of spatial elements: core and swing. We define three distinct types of regionalization biases that occur in graph‐based methods and showcase the use of the proposed method to capture these types of biases. Additionally, we propose an efficient solution to the probabilistic graph‐based regionalization problem via performing optimal tree cuts along random spanning trees within an evidence accumulation framework. We perform statistical tests on synthetic data to assess resulting probability maps for varying distinctness of underlying regions and regionalization parameters. Lastly, we showcase the application of our method to define probabilistic ecoregions using climatic and remotely sensed vegetation indicators and apply our method to assign probabilities to the expert‐defined Bailey's ecoregions.