Accelerating PQMRCGSTAB Algorithm on Xeon Phi


Journal Title: Advanced materials research 2013, Vol.709, p.555-562
Main Author: Wu, Qiang
Other Authors: Qi, Jin; Yao, Wen Ke; Yang, Can Qun; Chen, Cheng
Format: Electronic Article
Language: English
Publisher: Trans Tech Publications Ltd
ID: ISSN: 1022-6680
recordid: cdi_crossref_primary_10_4028_www_scientific_net_AMR_709_555
title: Accelerating PQMRCGSTAB Algorithm on Xeon Phi
format: Article
creator:
  • Wu, Qiang
  • Qi, Jin
  • Yao, Wen Ke
  • Yang, Can Qun
  • Chen, Cheng
ispartof: Advanced materials research, 2013, Vol.709, p.555-562
description: Solving large sparse linear systems with iterative methods is key to many practical mathematical and physical problems. Intel recently released Xeon Phi, a many-core processor based on Intel’s Many Integrated Core (MIC) architecture that comprises 60 cores and supports 512-bit SIMD operations. In this work, we aim to accelerate PQMRCGSTAB, an iterative algorithm for large sparse linear systems, by using both Xeon Phi’s 8-way vector operations and dense threads. We then propose three optimizations to improve performance: data prefetching to hide data access latency, vector register reuse, and SIMD-friendly reduction. In our experimental evaluation, Xeon Phi delivers a speedup of nearly 6x over the eight-core Intel Xeon E5-2670 CPU running the same problem.
language: eng
source:
identifier: ISSN: 1022-6680
fulltext: no_fulltext
issn:
  • 1022-6680
  • 1662-8985


doi: 10.4028/www.scientific.net/AMR.709.555
eissn: 1662-8985
rights: 2013 Trans Tech Publications Ltd
peer reviewed: yes
date: 2013-06-27
notes: Selected, peer reviewed papers from the 2013 International Conference on Applied Science, Engineering and Technology (ICASET 2013), May 19-21, 2013, Qingdao, China