Persistent identifier to cite or link to this item:
http://hdl.handle.net/10662/20372
Title: | Training deep neural networks: a static load balancing approach |
Authors: | Moreno Álvarez, Sergio; Haut Hurtado, Juan Mario; Paoletti Ávila, Mercedes Eugenia; Rico Gallego, Juan Antonio; Díaz Martín, Juan Carlos; Plaza Miguel, Javier |
Keywords: | Deep learning; High-performance computing; HPC; Distributed training; Heterogeneous platforms; Heterogeneous computing |
Publication date: | 2020 |
Publisher: | Springer Science+Business Media, LLC, part of Springer Nature |
Abstract: | Deep neural networks are currently trained under data-parallel setups on high-performance computing (HPC) platforms, so that a replica of the full model is assigned to each computational resource, which trains it on non-overlapping subsets of the data known as batches. Replicas combine the computed gradients to update their local copies at the end of each batch. However, differences in the performance of the resources assigned to replicas in current heterogeneous platforms induce waiting times when gradients are combined synchronously, leading to an overall performance degradation. Although asynchronous communication of gradients has been proposed as an alternative, it suffers from the so-called staleness problem: the training in each replica is computed using a stale version of the parameters, which negatively impacts the accuracy of the resulting model. In this work, we study the application of well-known HPC static load balancing techniques to the distributed training of deep models. Our approach assigns a different batch size to each replica, proportional to its relative computing capacity, hence minimizing the staleness problem. Our experimental results (obtained in the context of a remotely sensed hyperspectral image processing application) show that, while the classification accuracy is kept constant, the training time decreases substantially with respect to unbalanced training. This is illustrated using heterogeneous computing platforms made up of CPUs and GPUs with different performance. (A minimal sketch of this proportional batch-size assignment is given after the record below.) |
URI: | http://hdl.handle.net/10662/20372 |
ISSN: | 0920-8542 |
DOI: | 10.1007/s11227-020-03200-6 |
Collections: | DISIT - Artículos; DTCYC - Artículos |
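
The batch-size assignment described in the abstract can be illustrated with a short sketch: each replica receives a share of the global batch proportional to its relative computing capacity, so faster devices process more samples per iteration and synchronization waits are reduced. This is not the authors' implementation; the function name `proportional_batch_sizes`, its parameters, and the example capacities below are illustrative assumptions.

```python
# Minimal illustrative sketch (not the paper's code): split a global batch
# across replicas in proportion to their relative computing capacity, so
# faster replicas receive larger batches and all replicas finish an
# iteration at roughly the same time.

def proportional_batch_sizes(global_batch, capacities):
    """Return one batch size per replica, proportional to its capacity.

    capacities: relative throughputs of the replicas (e.g. samples/second).
    Shares are floored and the leftover samples are handed to the replicas
    with the largest fractional parts, so the sizes sum to global_batch.
    """
    total = sum(capacities)
    raw = [global_batch * c / total for c in capacities]
    sizes = [int(share) for share in raw]          # floor each share
    remainder = global_batch - sum(sizes)
    # Give the leftover samples to the replicas whose shares lost the most
    # in the flooring step.
    order = sorted(range(len(raw)), key=lambda i: raw[i] - sizes[i], reverse=True)
    for i in order[:remainder]:
        sizes[i] += 1
    return sizes


if __name__ == "__main__":
    # Hypothetical heterogeneous node: one GPU about 5x and one device about
    # 2x faster than the baseline CPU, sharing a global batch of 128 samples.
    print(proportional_batch_sizes(128, [5.0, 2.0, 1.0]))  # -> [80, 32, 16]
```
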
Files
File | Description | Size | Format |
---|---|---|---|
s11227-020-03200-6.pdf | Access restricted | 1.55 MB | Adobe PDF |
This item is subject to a Creative Commons License