Hay fever is an allergic reaction to pollen that causes discomfort and can impair sleep quality and work performance. To avoid excessive exposure to allergen sources, hay fever sufferers may benefit from knowing the locations of allergenic plants. With that in mind, the aim of this project is to develop rapeseed field recognition technology that processes images captured by drones and produces maps for hay fever sufferers. Specifically, rapeseed fields (a hay fever source) are chosen as the target object in order to assess the feasibility of the machine-learning-based detection technology developed in this project.
The rapeseed detection system must be able to distinguish rapeseed fields from other common yellow flower fields, such as sunflower fields, to avoid false alarms. The system is developed using state-of-the-art deep learning models, Convolutional Neural Networks (CNNs) and Region-based Convolutional Neural Networks (RCNNs), which are trained and deployed on GPUs. The CNN is a widely used image classification model: a multi-layer hierarchical model comprising a powerful feature extraction mechanism and a classifier that can produce highly complex decision boundaries. The RCNN is an object detection model built on the CNN.
While other object detection systems rely solely on the RCNN, this requires a GPU with sufficient computational power. The GPU available in this project had limited computational power, which restricted the resolution of images that could be processed during training and deployment. Down-sampled images are undesirable because detailed features are lost, which in turn can severely degrade training and deployment. A novel algorithm is therefore developed to address this problem. To compensate for the constraints of the GPU, the algorithm combines a CNN and an RCNN to analyse images; the two models are trained separately on different data sets. The detection problem is decomposed into two steps. First, flower field proposals are generated by the RCNN. Second, the proposals are classified by the CNN to distinguish rapeseed fields from irrelevant sources, such as other flower fields and background scenery.
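The two-step decomposition above can be sketched as follows. This is an illustrative outline, not the project's actual API: `propose_regions` and `classify_crop` are hypothetical stand-ins for the trained RCNN proposal stage and the CNN classifier.

```python
def detect_rapeseed(image, propose_regions, classify_crop, threshold=0.5):
    """Two-stage detection sketch: RCNN proposals filtered by a CNN classifier.

    `propose_regions` should return (x0, y0, x1, y1) boxes; `classify_crop`
    should return the CNN's rapeseed probability for a cropped region.
    """
    detections = []
    for x0, y0, x1, y1 in propose_regions(image):    # step 1: flower field proposals
        crop = [row[x0:x1] for row in image[y0:y1]]  # cut out the proposed region
        score = classify_crop(crop)                  # step 2: CNN classification
        if score >= threshold:                       # keep confident rapeseed hits
            detections.append(((x0, y0, x1, y1), score))
    return detections
```

Decomposing the problem this way lets the proposal stage run at a resolution the GPU can handle while the classifier still sees full-detail crops.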
The success of the detection system is largely contingent on the ability of the CNN to learn the features of rapeseed fields. This report therefore places particular emphasis on devising an optimal training strategy for the CNN model. The VGG architecture developed by the Visual Geometry Group is adopted because its depth enables a deep hierarchical representation of features to be computed. Having developed mathematical intuition for the model, an appropriate training strategy is applied: dropout and regularisation techniques are used to reduce the likelihood of misclassifying non-rapeseed fields as rapeseed fields, which would create false alarms for individuals with hay fever.
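As a rough illustration of the dropout technique mentioned above (a plain-Python sketch, not the project's actual Caffe configuration): during training each unit is zeroed with probability `rate` and the survivors are rescaled by 1/(1-rate), so the expected activation is unchanged and no rescaling is needed at test time.

```python
import random

def dropout(activations, rate=0.5, train=True, seed=None):
    """Inverted dropout sketch: zero each unit with probability `rate` during
    training and rescale survivors by 1/(1 - rate); identity at test time."""
    if not train or rate == 0.0:
        return list(activations)
    rng = random.Random(seed)
    keep = 1.0 - rate
    return [v / keep if rng.random() < keep else 0.0 for v in activations]
```

Randomly silencing units discourages co-adaptation of features; weight regularisation (e.g. an L2 penalty in the solver) complements it by keeping the decision boundary smooth.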
The performance of the overall detection system is examined both visually on test examples and through confusion matrix analysis, and common causes of error are analysed. A detection accuracy of 86.7% is achieved on the test data set. The compiled Docker image contains all the code and software tools developed in this project and can be used on any operating system by anyone who wishes to contribute to this project. Further work is needed to combine the rapeseed detection system developed in this project with computer vision techniques that can convert image pixels into their corresponding GPS locations.
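The confusion-matrix evaluation mentioned above can be reproduced with a few lines of plain Python (a generic sketch; the label names are illustrative, not the project's class list):

```python
def confusion_matrix(y_true, y_pred, labels):
    """Build a nested dict: rows are true labels, columns are predictions."""
    m = {t: {p: 0 for p in labels} for t in labels}
    for t, p in zip(y_true, y_pred):
        m[t][p] += 1
    return m

def accuracy(m):
    """Fraction of samples on the diagonal of the confusion matrix."""
    correct = sum(m[label][label] for label in m)
    total = sum(sum(row.values()) for row in m.values())
    return correct / total
```

Beyond the single accuracy figure, the off-diagonal cells show which non-rapeseed classes are most often mistaken for rapeseed, i.e. the false-alarm sources.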
Nvidia driver updated to 375.23. Ubuntu and Caffe tools installed. No GPU/software compatibility issues.
Image augmentation tools in img_aug directory.
Visualisation tools in vis_tools directory. They are used to generate images which maximally activate certain neurons via back-propagation.
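As a toy illustration of the activation-maximisation idea (this is not the vis_tools code, just a sketch for a single linear neuron): gradient ascent adjusts the input in the direction that increases the neuron's activation; for a linear neuron a = w · x the gradient with respect to x is simply w.

```python
def maximise_activation(weights, steps=50, lr=0.1):
    """Gradient ascent on the input of a linear neuron a = w . x (toy sketch)."""
    x = [0.0] * len(weights)
    for _ in range(steps):
        # da/dx_i = w_i, so step each input element along its weight
        x = [xi + lr * wi for xi, wi in zip(x, weights)]
    return x
```

In the real tools the same principle is applied through a full CNN, so the back-propagated gradient produces an image showing what pattern a neuron responds to.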
A simple GUI program is created to annotate images and to generate annotation text files efficiently. Training images in training_img_folder; annotation tools included.
Training data is organised as follows:
|-- .xml (Annotation files generated using labelImg, see later sections)
|-- .jpg (train + test Image files)
|-- train.txt (determines which images in “Images” are for training)
|-- test.txt (determines which images in “Images” are for testing)
|-- test (empty before test)
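A train/test split like the one above can be generated with a short script. The directory and file names below follow the layout shown, but the function itself is an illustrative sketch, not part of the project's tooling:

```python
import os
import random

def write_splits(image_dir, out_dir, train_frac=0.8, seed=0):
    """List .jpg stems in `image_dir` and write them to train.txt/test.txt."""
    stems = sorted(os.path.splitext(f)[0] for f in os.listdir(image_dir)
                   if f.lower().endswith('.jpg'))
    random.Random(seed).shuffle(stems)       # reproducible shuffle
    k = int(len(stems) * train_frac)
    for name, chunk in (('train.txt', stems[:k]), ('test.txt', stems[k:])):
        with open(os.path.join(out_dir, name), 'w') as f:
            f.write('\n'.join(chunk) + '\n')
```

Writing the two lists from a single shuffled pool guarantees they are disjoint, which avoids test images leaking into training.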
COMMON ERRORS (when training is launched)
File "./tools/train_net.py", line 111, in <module>
imdb, roidb = combined_roidb(args.imdb_name)
File "./tools/train_net.py", line 71, in combined_roidb
print 'roidbs is' , get_roidb(imdb_names)
File "./tools/train_net.py", line 66, in get_roidb
roidb = get_training_roidb(imdb)
line 122, in get_training_roidb
line 27, in prepare_roidb
roidb[i]['image'] = imdb.image_path_at(i)
AssertionError: Number of boxes must match number of ground-truth images
SOLUTION: the number of files used to produce train.mat with selective search does
not equal the number of images specified in the text file (probably because the
wrong train.mat file is used); a new train.mat file therefore needs to be produced.
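A quick way to catch this mismatch before launching training is to compare the two counts directly. In this sketch, `num_box_sets` is a hypothetical stand-in for however many proposal sets the regenerated train.mat actually contains (obtained however the file was loaded):

```python
def check_counts(num_box_sets, list_file):
    """Fail early if the number of selective-search box sets does not match
    the number of image names listed in the train/test text file."""
    with open(list_file) as f:
        names = [line.strip() for line in f if line.strip()]
    if num_box_sets != len(names):
        raise AssertionError(
            'Number of boxes (%d) must match number of ground-truth images (%d)'
            % (num_box_sets, len(names)))
    return len(names)
```

Running this check after regenerating train.mat confirms the fix before sitting through another failed training launch.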
AssertionError: all(max_classes[nonzero_inds] != 0)
SOLUTION (NOT SURE): this is just one of the sanity checks, concerning how much
the background annotation boxes overlap compared with the stroller/carrier classes.