Persistent identifier to cite or link to this item: http://hdl.handle.net/10662/21063
Full Metadata Record
DC Field | Value | Language
dc.contributor.author | Moreno Álvarez, Sergio | -
dc.contributor.author | Paoletti Ávila, Mercedes Eugenia | -
dc.contributor.author | Cavallaro, Gabriele | -
dc.contributor.author | Haut Hurtado, Juan Mario | -
dc.date.accessioned | 2024-04-19T10:23:15Z | -
dc.date.available | 2024-04-19T10:23:15Z | -
dc.date.issued | 2023 | -
dc.description | Accepted version of the article published in IEEE Transactions on Neural Networks and Learning Systems, pp. 1-15, 2023. ISSN 2162-237X | -
dc.description.abstract | The amount of data needed to effectively train modern deep neural architectures has grown significantly, leading to increased computational requirements. These intensive computations are tackled by combining latest-generation computing resources, such as accelerators, with classic processing units. Nevertheless, gradient communication remains the major bottleneck, hindering efficiency despite the runtime improvements obtained through data parallelism strategies. Data parallelism involves all processes in a global exchange of potentially large amounts of data, which may prevent the desired speedup and introduce noticeable delays or bottlenecks. As a result, communication latency poses a significant challenge that profoundly impacts performance on distributed platforms. This research presents node-based optimization steps that significantly reduce the gradient exchange between model replicas whilst ensuring model convergence. The proposal serves as a versatile communication scheme, suitable for integration into a wide range of general-purpose deep neural network (DNN) algorithms. The optimization takes into consideration the specific location of each replica within the platform. To demonstrate its effectiveness, different neural network approaches and datasets with disjoint properties are used. In addition, multiple types of applications are considered to demonstrate the robustness and versatility of our proposal. The experimental results show a global training time reduction whilst slightly improving accuracy. (An illustrative sketch of such a node-based exchange follows this metadata record.) Code: https://github.com/mhaut/eDNNcom | es_ES
dc.description.sponsorship | This work was supported in part by the Consejería de Economía, Ciencia y Agenda Digital of the Junta de Extremadura, in part by the European Regional Development Fund (ERDF) of the European Union under Grant GR21040, Grant GR21099, and Grant IB20040, in part by the Spanish Ministerio de Ciencia e Innovación under Project PID2019-110315RB-I00 (APRISA), in part by the DEEP-EST Project (computing resources), and in part by the European Union's Horizon 2020 Research and Innovation Programme under Grant 754304. | es_ES
dc.format.extent | 15 p. | es_ES
dc.format.mimetype | application/pdf | en_US
dc.language.iso | eng | es_ES
dc.rights | Attribution-NonCommercial-NoDerivatives 4.0 International | *
dc.rights.uri | http://creativecommons.org/licenses/by-nc-nd/4.0/ | *
dc.subject | Paralelismo de datos | es_ES
dc.subject | Enseñanza profunda | es_ES
dc.subject | Computación de alto rendimiento | es_ES
dc.subject | Redes neuronales | es_ES
dc.subject | Comunicación síncrona | es_ES
dc.subject | Data parallelism | es_ES
dc.subject | Deep learning | es_ES
dc.subject | High-performance computing | es_ES
dc.subject | Neural networks | es_ES
dc.subject | Synchronous communications | es_ES
dc.title | Enhancing Distributed Neural Network Training through Node-Based Communications | es_ES
dc.type | preprint | es_ES
dc.description.version | peerReviewed | es_ES
europeana.type | TEXT | en_US
dc.rights.accessRights | openAccess | es_ES
dc.subject.unesco | 2490.02 Neuroquímica | es_ES
dc.subject.unesco | 2490 Neurociencias | es_ES
europeana.dataProvider | Universidad de Extremadura. España | es_ES
dc.identifier.bibliographicCitation | S. Moreno-Álvarez, M. E. Paoletti, G. Cavallaro and J. M. Haut, "Enhancing Distributed Neural Network Training Through Node-Based Communications," in IEEE Transactions on Neural Networks and Learning Systems, doi: 10.1109/TNNLS.2023.3309735 | es_ES
dc.type.version | acceptedVersion | es_ES
dc.contributor.affiliation | Universidad de Extremadura. Departamento de Ingeniería de Sistemas Informáticos y Telemáticos | es_ES
dc.contributor.affiliation | Universidad de Extremadura. Departamento de Tecnología de los Computadores y de las Comunicaciones | es_ES
dc.relation.publisherversion | https://ieeexplore.ieee.org/document/10254237 | es_ES
dc.identifier.doi | 10.1109/TNNLS.2023.3309735 | -
dc.identifier.publicationfirstpage | 1 | es_ES
dc.identifier.publicationlastpage | 15 | es_ES
dc.identifier.orcid | 0000-0002-1858-9920 | es_ES
dc.identifier.orcid | 0000-0003-1030-3729 | es_ES
dc.identifier.orcid | 0000-0002-3239-9904 | es_ES
dc.identifier.orcid | 0000-0001-6701-961X | es_ES
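
Illustrative sketch (not part of the repository record): the abstract describes a node-based communication scheme in which gradient exchange takes the location of each replica within the platform into account. The authors' actual implementation is the eDNNcom code linked above; the snippet below is only a minimal, hypothetical sketch of the general idea using PyTorch's torch.distributed, assuming ranks are assigned node-contiguously. Replicas first average gradients within their own node, then only one leader process per node takes part in the global exchange, and the leader redistributes the result locally. The names build_hierarchical_groups, hierarchical_average, and procs_per_node are invented for illustration.

# Hypothetical sketch of location-aware gradient averaging.
# NOT the eDNNcom implementation; assumes node-contiguous rank layout
# (ranks 0..P-1 on node 0, P..2P-1 on node 1, ...).
import torch
import torch.distributed as dist

def build_hierarchical_groups(procs_per_node: int):
    """Create one intra-node group per node and an inter-node group of node leaders."""
    world = dist.get_world_size()
    rank = dist.get_rank()
    n_nodes = world // procs_per_node
    intra_group = None
    for node in range(n_nodes):
        ranks = list(range(node * procs_per_node, (node + 1) * procs_per_node))
        # every process must take part in every new_group call, in the same order
        group = dist.new_group(ranks=ranks)
        if rank in ranks:
            intra_group = group
    # the first rank on each node acts as that node's leader
    inter_group = dist.new_group(ranks=list(range(0, world, procs_per_node)))
    return intra_group, inter_group

def hierarchical_average(grads, intra_group, inter_group, procs_per_node: int):
    """Average gradient tensors node-locally first, then across node leaders only."""
    world = dist.get_world_size()
    rank = dist.get_rank()
    n_nodes = world // procs_per_node
    leader = (rank // procs_per_node) * procs_per_node  # first rank on this node
    for g in grads:
        # step 1: average inside the node (fast local interconnect)
        dist.all_reduce(g, op=dist.ReduceOp.SUM, group=intra_group)
        g /= procs_per_node
        # step 2: only node leaders exchange over the network
        if rank == leader:
            dist.all_reduce(g, op=dist.ReduceOp.SUM, group=inter_group)
            g /= n_nodes
        # step 3: leader redistributes the global average within its node
        dist.broadcast(g, src=leader, group=intra_group)

In a data-parallel training loop, such a routine would be called between loss.backward() and optimizer.step(), e.g. hierarchical_average([p.grad for p in model.parameters()], intra_group, inter_group, procs_per_node), so that only one process per node contributes to the global exchange rather than every replica.
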
Collections: DISIT - Artículos
DTCYC - Artículos

Files
File | Size | Format
TNNLS_2023_3309735.pdf | 3.11 MB | Adobe PDF


This item is licensed under a Creative Commons License