
In this paper, we study a deep learning (DL)-based multimodal technology for military, surveillance, and defense applications based on pixel-by-pixel classification of a soldier image dataset. We explore the acquisition of images from a remote tactical robot to a ground station, where the detection and tracking of soldiers can help the operator take action or automate the tactical robot on the battlefield. Soldier detection is achieved by training a convolutional neural network (CNN) to learn the patterns of the soldiers' uniforms. Our CNN learns from the initial dataset and from the actions taken by the operator, in contrast to traditional hard-coded image processing algorithms. Our system attains an accuracy of over 81% in distinguishing the specific soldier uniform from the background. These experimental results support our hypothesis that dilated convolutions can improve segmentation performance compared with patch-based and fully connected networks.
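To illustrate the dilated-convolution approach to pixel-wise (uniform vs. background) classification described above, the following is a minimal PyTorch sketch. The layer widths, dilation rates, input resolution, and two-class output are illustrative assumptions, not the exact architecture or framework reported in the paper.

```python
# Minimal sketch: dilated convolutions for per-pixel binary segmentation.
# All hyperparameters below are assumptions for illustration only.
import torch
import torch.nn as nn

class DilatedSegNet(nn.Module):
    def __init__(self, in_channels: int = 3, num_classes: int = 2):
        super().__init__()
        # Stacked 3x3 convolutions with growing dilation enlarge the receptive
        # field without pooling, so the output keeps full spatial resolution.
        self.features = nn.Sequential(
            nn.Conv2d(in_channels, 32, kernel_size=3, padding=1, dilation=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(32, 64, kernel_size=3, padding=2, dilation=2),
            nn.ReLU(inplace=True),
            nn.Conv2d(64, 64, kernel_size=3, padding=4, dilation=4),
            nn.ReLU(inplace=True),
        )
        # A 1x1 convolution maps features to per-pixel class scores.
        self.classifier = nn.Conv2d(64, num_classes, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.features(x))

if __name__ == "__main__":
    model = DilatedSegNet()
    frame = torch.randn(1, 3, 240, 320)   # one RGB frame (hypothetical size)
    logits = model(frame)                 # (1, 2, 240, 320) per-pixel scores
    mask = logits.argmax(dim=1)           # soldier/background label per pixel
    print(mask.shape)                     # torch.Size([1, 240, 320])
```

Unlike a patch-based classifier, which crops and classifies one window per pixel, this fully convolutional design labels every pixel in a single forward pass while the dilations supply the wider context that patches would otherwise provide.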
Subjects: Electronic computers. Computer science (QA75.5-76.95); Cybernetics (Q300-390)
