[Figure: Modified pHPF SPMDizer. Components: Data Partitioning, Preprocessing, Communication Analysis, Communication Code Generation, Dataflow/Dependence Analyzer, Loop Transformer, Postprocessing; trace dump to lst file for hand compilation.]

[Figure: Algorithm steps. 1. Find Earliest and Latest; 2. Mark candidate statements; 3. Eliminate subsets; 4. Eliminate redundancy; 5. Choose final candidate.]

[Figure: Bandwidth (B/s) versus buffer size (Bytes) for Bcopy, Send, Recv, and SendRecv.]

[Figure: Loop control-flow structure. Nodes and edges: Pre-header, Header, Zero-trip edge, Loop, Loop-back edge, Body, Exit, Post-exit.]

[Figure: Bar charts of normalized CPU and NW components (scale 0 to 1) per program version, one panel per benchmark, for problem sizes ranging from n=28 to n=500.]