Cross Reference: /external/openssl/crypto/bn/asm/ia64.S

Lines Matching refs:loop
18 // "wider" than Itanium? Can you experience loop scalability as
37 // Wrong! Note that getf latency increased. This means that if a loop is
171 .skip	32	// makes the loop body aligned at 64-byte boundary
183 	brp.loop.imp	.L_bn_add_words_ctop,.L_bn_add_words_cend-16
224 .skip	32	// makes the loop body aligned at 64-byte boundary
236 	brp.loop.imp	.L_bn_sub_words_ctop,.L_bn_sub_words_cend-16
283 .skip	32	// makes the loop body aligned at 64-byte boundary
306 	brp.loop.imp	.L_bn_mul_words_ctop,.L_bn_mul_words_cend-16
317 // This loop spins in 2*(n+12) ticks. It's scheduled for data in Itanium
320 // ldf8. The loop is not scalable and shall run in 2*(n+12) even on
321 // "wider" IA-64 implementations. It's a trade-off here. n+24 loop
326 // this very instruction sequence in bn_mul_add_words loop which in
359 // of Intel the following loop is commented out? Indeed, it looks so
363 // The loop therefore spins at the latency of xma minus 1, or in other
364 // words at 6*(n+4) ticks:-( Compare to the "production" loop above
397 .skip	48	// makes the loop body aligned at 64-byte boundary
412 	brp.loop.imp	.L_bn_mul_add_words_ctop,.L_bn_mul_add_words_cend-16
423 // This loop spins in 3*(n+10) ticks on Itanium and in 2*(n+10) on
465 .skip	32	// makes the loop body aligned at 64-byte boundary
486 	brp.loop.imp	.L_bn_sqr_words_ctop,.L_bn_sqr_words_cend-16
497 // will appear larger than loss on "wider" IA-64, then the loop should