Plant Disease "Say" Platform
Plant diseases are among the main hazards to crop growth and seriously aggravate food supply shortages. Timely, effective diagnosis is therefore essential for crop management.
Application of artificial intelligence in plant disease diagnosis
AI approaches that use image processing, multimodal data fusion, mobile applications, and drone remote sensing have shown clear advantages. However, existing plant disease diagnosis methods depend mainly on how well early detection works, and detection in complex cases remains challenging.
Advantages of a multimodal approach

In plant disease diagnosis, multimodal methods that combine images with text reports show significant advantages. They not only identify the disease more accurately but also provide more comprehensive diagnostic information.

Improvements in early detection

The multimodal approach has been validated in medical diagnosis and also offers a better solution for the early detection of plant diseases. With multimodal data fusion, problems can be identified and acted on earlier, reducing damage to crops.

Solving the single-modality problem

As with advances in smart medicine, overcoming the single-modality limitation in plant disease recognition requires multimodal techniques that can generate text descriptions. This not only improves diagnostic accuracy but also gives researchers and farmers more detailed disease information.

Introduction to new tools
We propose a tool that aligns plant phenotypes with trait descriptions by progressively masking the diagnostic texts of disease images. We also annotated 5,728 disease phenotype images with expert diagnostic texts and provided annotated texts and trait labels for 210,000 disease images for training and validation.
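To make the idea of progressive masking concrete, here is a minimal sketch. It assumes whitespace tokenization, a `[MASK]` placeholder, and a fixed schedule of masking ratios; none of these details are given above, and the names `progressive_masks` and `MASK` are hypothetical rather than part of the tool.

```python
import random

MASK = "[MASK]"  # hypothetical placeholder token


def progressive_masks(text, ratios=(0.25, 0.5, 0.75), seed=0):
    """Return one masked copy of `text` per ratio, hiding more tokens each time.

    Assumption: positions are shuffled once, so the tokens hidden at a
    lower ratio stay hidden at every higher ratio.
    """
    tokens = text.split()
    order = list(range(len(tokens)))
    random.Random(seed).shuffle(order)
    stages = []
    for ratio in ratios:
        hidden = set(order[: int(len(tokens) * ratio)])
        stages.append([MASK if i in hidden else tok for i, tok in enumerate(tokens)])
    return stages


# Illustrative diagnostic text, not taken from the annotated dataset.
diagnosis = "brown circular lesions with yellow halos on lower leaves"
for masked in progressive_masks(diagnosis):
    print(" ".join(masked))
```

Each printed line hides a larger share of the diagnostic text, so a model trained on these inputs has to rely more and more on the image to fill in the missing trait words.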
Three stages of model training and experimental results
Our approach uses three training stages to embed image features into semantic structures, producing descriptions that retain the image's features and are better suited to phenotypic description. The experimental results demonstrate the effectiveness of our method and indicate broad application prospects in plant disease diagnosis.
Stage One and Stage Two
In the first and second stages, the model learns the structure of the specialized sentences used to describe plant phenotypes by masking part of each disease image's feature description and adding reconstruction statements to the phenotype description. The main goal of these two stages is to optimize the alignment between image features and text descriptions so that the resulting descriptions accurately reflect the plant's disease characteristics.
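The masking-and-reconstruction objective can be illustrated with a small sketch. It assumes that training examples are pairs of a masked description and the hidden original tokens, and that reconstruction is scored only on the masked positions. The helper names and the sample sentence are hypothetical, and the actual training loss is not specified here.

```python
MASK = "[MASK]"  # hypothetical placeholder token


def make_reconstruction_pair(tokens, hidden_positions):
    """Return (masked_input, targets); targets maps position -> original token."""
    hidden = set(hidden_positions)
    masked_input = [MASK if i in hidden else tok for i, tok in enumerate(tokens)]
    targets = {i: tokens[i] for i in sorted(hidden)}
    return masked_input, targets


def reconstruction_accuracy(predicted_tokens, targets):
    """Fraction of masked positions that were restored correctly."""
    if not targets:
        return 1.0
    correct = sum(predicted_tokens[i] == tok for i, tok in targets.items())
    return correct / len(targets)


tokens = "leaf surface shows gray mold with water-soaked margins".split()
masked_input, targets = make_reconstruction_pair(tokens, hidden_positions=[3, 4, 6])

# Stand-in for a model output: two of the three hidden words are restored.
prediction = list(tokens)
prediction[6] = "dry"

print(" ".join(masked_input))
print(reconstruction_accuracy(prediction, targets))
```

Scoring only the masked positions keeps the signal focused on the trait words the model had to recover, rather than on the words it could simply copy.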
Final Stage
In the final stage, the model is trained for image-text feature selection and alignment, and dialogue descriptions are generated through its visual and text feature extractors. This stage focuses on producing high-quality dialogue descriptions that support more accurate disease diagnosis and phenotypic description.
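One common way to align image and text features is a contrastive objective over cosine similarities. The loss used in this stage is not stated above, so the sketch below is an assumption for illustration only: the feature vectors are toy values, and `contrastive_loss` and `temperature` are hypothetical names.

```python
import math


def cosine(u, v):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def contrastive_loss(image_feats, text_feats, temperature=0.1):
    """Mean cross-entropy of matching image i to text i among all texts.

    Only the image-to-text direction is computed, to keep the sketch short.
    """
    losses = []
    for i, img in enumerate(image_feats):
        logits = [cosine(img, txt) / temperature for txt in text_feats]
        top = max(logits)  # subtract the max for numerical stability
        log_sum = top + math.log(sum(math.exp(x - top) for x in logits))
        losses.append(log_sum - logits[i])
    return sum(losses) / len(losses)


# Toy 2-D features: image i is paired with text i.
images = [[1.0, 0.1], [0.0, 1.0]]
aligned_texts = [[0.9, 0.0], [0.1, 1.0]]
swapped_texts = aligned_texts[::-1]

print(contrastive_loss(images, aligned_texts))
print(contrastive_loss(images, swapped_texts))
```

The loss is small when each image is closest to its own description and large when the pairing is swapped, which is the behavior an alignment stage aims to encourage.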
Experimental Results
In experimental validation, our approach outperforms multiple cutting-edge models. It also outperforms the larger GPT-4 and GPT-4o models on multiple feature descriptors. These results show that our approach can provide more accurate disease diagnosis and phenotypic description while maintaining high efficiency.
Future outlook
In the future, we plan to add more disease images and corresponding diagnostic texts to keep improving the model's accuracy and adaptability. We will also introduce more deep learning techniques to make feature selection and comparison more efficient and to yield more accurate, detailed phenotypic descriptions.