Hardware accelerated basic blocks for power-aware intercommunication in HPC and embedded systems

Date

2014

Authors

Tampouratzis Nikolaos
Ταμπουρατζης Νικολαος

Publisher

Technical University of Crete

Abstract

In the past, a transition to the next fabrication process typically translated to more transistors, higher frequency and lower power. Higher frequencies, paired with innovations in computer architecture, defined the semiconductor industry and research until the mid-90s. At that point architecture research saturated and industry relied on technology scaling for performance gains. During the mid-00s frequency scaling saturated as well. Transistor count, the only resource that kept scaling reliably, together with intra-chip parallelism, which could leverage and extend the existing knowledge from earlier supercomputers, emerged as the only way to keep Moore's law alive. In parallel systems, computing nodes cooperate to solve processing-intensive problems. Communication between nodes is achieved through a variety of protocols. Traditionally, research has focused on optimizing these protocols and identifying the most suitable ones per system and application. Recently, the Portals system has attempted to unify the primitive operations of the proposed intercommunication protocols. Portals offers a set of low-level communication routines that can be composed to model complex protocols. However, the modularity of Portals comes at a performance cost relative to existing communication protocols, which have been heavily tuned: many of their timing-critical parts have been decoupled from the main execution thread and, in many cases, accelerated as dedicated hardware. This work aims to close the performance gap between a generic and reusable intercommunication layer, Portals, and the several monolithic but highly tuned protocols. A software-driven, hardware-accelerated system is suggested, which relies on the execution of actual software to highlight the critical parts of the communication routines. Accelerating the bottlenecks starts by modeling the hardware in untimed virtual prototypes and the software on a range of candidate embedded processors. A novel path from hardware prototypes to actual silicon allows rapid characterization of the accelerator in terms of power, performance and area. The suggested approach yields a speedup starting at one order of magnitude in the bottleneck components of Portals, and reaches up to two orders of magnitude for both the MPI and GA baseline implementations on a recent embedded processor.

Keywords

Accelerate Intercommunication cost in HPC and Embedded Systems, Embedded systems (Computer systems), embedded computer systems, embedded systems computer systems, HPC (Computer science), high performance computing, hpc computer science

Citation

Nikolaos Tampouratzis, "Hardware accelerated basic blocks for power-aware intercommunication in HPC and embedded systems", Master's Thesis, School of Electronic and Computer Engineering, Technical University of Crete, Chania, Greece, 2014
