
Device-to-device (D2D) communication is an emerging technology in the evolution of 5G-enabled vehicle-to-vehicle (V2V) communications. It is a core technique for the next generation of many platforms and applications, e.g. real-time high-quality video streaming, virtual reality gaming, and smart city operation. However, the rapid proliferation of user devices and sensors creates a need for more efficient resource allocation algorithms that enhance network performance while still guaranteeing quality of service. Deep reinforcement learning is emerging as a powerful tool that gives each node in the network a real-time self-organising ability. In this paper, we present two novel approaches based on the deep deterministic policy gradient algorithm, namely "distributed deep deterministic policy gradient" and "sharing deep deterministic policy gradient", for the multi-agent power allocation problem in D2D-based V2V communications. Numerical results show that our proposed models outperform other deep reinforcement learning approaches in terms of the network's energy efficiency and flexibility.
Artificial intelligence; Device-to-Device Communication; Flexibility (engineering); Wireless Energy Harvesting and Information Transfer; SDG 7 - Affordable and Clean Energy; SDG 11 - Sustainable Cities and Communities; General Materials Science; General Computer Science; General Engineering; Engineering; Quality of service; Reinforcement learning; FOS: Electrical engineering, electronic engineering, information engineering; FOS: Mathematics; Electrical and Electronic Engineering; Resource allocation; Computer network; multi-agent deep reinforcement learning; Non-cooperative D2D communication; Statistics; Next Generation 5G Wireless Networks; power allocation; Intelligent Reflecting Surfaces in Wireless Communications; Computer science; deep deterministic policy gradient (DDPG); Distributed computing; 004; TK1-9971; D2D-based V2V communications; Physical Sciences; Electrical engineering. Electronics. Nuclear engineering; Mathematics
| Indicator | Description | Value |
| --- | --- | --- |
| selected citations | Citations derived from selected sources. An alternative to the influence indicator, which also reflects the overall impact of an article in the research community at large, based on the underlying citation network (diachronically). | 43 |
| popularity | The current impact or attention (the "hype") of an article in the research community at large, based on the underlying citation network. | Top 10% |
| influence | The overall impact of an article in the research community at large, based on the underlying citation network (diachronically). | Top 10% |
| impulse | The initial momentum of an article directly after its publication, based on the underlying citation network. | Top 10% |
