
doi: 10.26021/16084
Traffic Sign Detection (TSD) and Traffic Sign Recognition (TSR) are well-studied topics, as they play a crucial role in driver assistance systems, autonomous driving, and safe driving systems. TSD and TSR are usually treated as two separate tasks. The TSD task is to determine whether traffic signs are present in an image or video frame and where they are located, and the TSR task is to classify each detected candidate sign (Oza, 2021). In this research project, both tasks were solved with convolutional neural network models. The YOLO (You Only Look Once) algorithm is a real-time object detection system that predicts bounding boxes and class probabilities (AI Rabbani Alif, 2024). It was used for the TSD model, and because the GTSDB dataset is not large enough to train the model from scratch, transfer learning was used to set the initial weights of the YOLO backbone. A 6-layer, 3-block convolutional neural network was built from scratch as the TSR model. The German Traffic Sign Detection Benchmark (GTSDB) dataset and the German Traffic Sign Recognition Benchmark (GTSRB) dataset were used to train the TSD and TSR models, respectively. Based on these models, further research on hyperparameter tuning, transfer learning, and dataset expansion was carried out, and the results were analyzed. For hyperparameter tuning, the choice of batch size, number of epochs, network size, number of dropout layers, dropout rate, and other settings was analyzed. For transfer learning, the Modified National Institute of Standards and Technology (MNIST) dataset was used, and experiments were carried out between MNIST and GTSRB to compare the performance of transferred models with that of models trained from scratch.
With the TSD and TSR models, a computational pipeline was built that takes images or videos as input and predicts how many traffic signs are present in each image or frame, where they are located, and which class each belongs to. To close the gap between the coverage of the GTSDB dataset and that of the GTSRB dataset, the GTSRB dataset was manually expanded in this research, which enabled the model trained on the expanded GTSRB to recognize "Unknown" traffic sign types. Trackers were built for video processing to make the displayed results more stable. Three types of tracker were built: a naive tracker, a Kalman filter tracker, and an unscented Kalman filter tracker. Their performance on videos was compared. The machine learning-based pipeline was built on an existing innovation project owned by Verizon Connect. In this pipeline, data captured by truck-mounted cameras is the input, and the type of each traffic sign and its geo-information are the output.
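The abstract does not say how the naive tracker works. As an illustration only, the sketch below assumes one common approach: each detection in a new frame is greedily matched to the existing track whose box overlaps it most (intersection over union, IoU), and a track is dropped after it has gone unmatched for a few frames. The names `NaiveTracker`, `Track`, `iou_threshold` and `max_missed` are hypothetical and are not taken from the thesis.

```python
from dataclasses import dataclass
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


@dataclass
class Track:
    track_id: int
    box: Box
    label: str
    missed: int = 0  # consecutive frames without a matching detection


class NaiveTracker:
    """Hypothetical greedy IoU tracker (an assumption, not the thesis code)."""

    def __init__(self, iou_threshold: float = 0.3, max_missed: int = 5):
        self.iou_threshold = iou_threshold
        self.max_missed = max_missed
        self.tracks: List[Track] = []
        self._next_id = 0

    def update(self, detections: List[Tuple[Box, str]]) -> List[Track]:
        """Feed one frame of (box, label) detections; return live tracks."""
        unmatched = list(detections)
        for track in self.tracks:
            # Pick the remaining detection that overlaps this track most.
            best = max(unmatched, key=lambda d: iou(track.box, d[0]), default=None)
            if best is not None and iou(track.box, best[0]) >= self.iou_threshold:
                track.box, track.label, track.missed = best[0], best[1], 0
                unmatched.remove(best)
            else:
                track.missed += 1
        # Drop tracks that have been unmatched for too many frames.
        self.tracks = [t for t in self.tracks if t.missed <= self.max_missed]
        # Every detection left over starts a new track.
        for box, label in unmatched:
            self.tracks.append(Track(self._next_id, box, label))
            self._next_id += 1
        return list(self.tracks)
```

In use, the detector and classifier outputs for each frame would be passed to `update`, and the returned tracks would be drawn instead of the raw detections, so that a sign missed in a single frame does not flicker out of the display. A Kalman filter tracker would replace the "keep the last box" step with a motion-model prediction.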
