More on finding a Single Number to indicate Overall Performance of a Benchmark Suite

Lizy Kurian John
Electrical and Computer Engineering Department
The University of Texas at Austin
ljohn@ece.utexas.edu

The topic of finding a single number to summarize overall performance over a benchmark suite continues to be a difficult issue years after Smith's paper [1]. While significant insight into the problem has been provided by Smith [1], Hennessy and Patterson [2], Cragon [3], etc., the research community still seems to be unclear on the correct mean to use for different performance metrics. How should metrics obtained from individual benchmarks be aggregated to present a summary of the performance over the entire suite? What are valid central tendency measures over the whole benchmark suite for speedup, CPI, IPC, MIPS, MFLOPS, cache miss rates, cache hit rates, branch misprediction rates, etc.? Arithmetic mean has been touted to be appropriate for time-based metrics, while harmonic mean is touted to be appropriate for rate-based metrics. Is cache miss rate a rate-based metric, and hence is harmonic mean appropriate? Geometric mean is a valid measure of central tendency for ratios or dimensionless quantities []; however, it is also advised that geometric mean should not be used for summarizing any performance measure [,]. Speedup, which is a popular metric in most architecture papers to indicate performance enhancement by the proposed architecture, is dimensionless and a ratio-based measure. What will be an appropriate measure to summarize speedups from individual benchmarks? It is known that weighted means should be used if the benchmarks are not equally weighted. What does equally weighted mean? Does equal weight mean that each benchmark is run once, that each benchmark is equally likely to be in the workload of the user, that all benchmarks have an equal number of instructions, or that all benchmarks run for an equal number of cycles? Whenever two machines are compared, there is always the question of whether the benchmarks are equally weighted in the baseline machine or in the enhanced machine. Note that both cannot be true unless each benchmark is enhanced equally. This paper provides some answers to the above questions in the context of aggregating metrics from individual benchmarks in a benchmark suite.
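We show that weighted arithmetic or harmonic means can be used interchangeably and correctly, provided the appropriate weights are applied. We give mathematical proofs to establish this.

MIPS as an example

Let us start with MIPS as an example metric. Let us assume that the benchmark suite is composed of n benchmarks and that their individual MIPS ratings are known. We know that the overall MIPS of the entire suite is the total instruction count in millions divided by the total time taken for execution. Hence,

$$\text{Overall MIPS} = \frac{\sum_{i=1}^{n} I_i}{\sum_{i=1}^{n} t_i} \qquad (1)$$

where $I_i$ is the instruction count of each component benchmark (in millions) and $t_i$ is the execution time of each benchmark. Assume $\text{MIPS}_i$ is the MIPS rating of each individual benchmark. The overall MIPS is essentially the MIPS when the n benchmarks are considered as parts of a big application.

We find that the overall MIPS of the suite can be obtained by computing a Weighted Harmonic Mean (W.H.M.) of the MIPS of the individual benchmarks weighted according to the instruction counts, or by computing a Weighted Arithmetic Mean (W.A.M.) of the individual MIPS with weights corresponding to the execution times spent in each benchmark in the suite. Let us establish this mathematically.

The weights of the individual benchmarks according to instruction counts ($\omega_i^{I}$) are $I_1/\sum_{i=1}^{n} I_i$, $I_2/\sum_{i=1}^{n} I_i$, etc. All summations in this paper are over the n benchmarks as in eq. 1, and hence, for compactness, we are going to just use the summation sign from now on. The weights of the individual benchmarks according to execution times ($\omega_i^{t}$) are $t_1/\sum t_i$, $t_2/\sum t_i$, etc.

Now, the W.H.M. with weights corresponding to instruction counts is

$$\text{W.H.M.} = \frac{1}{\sum \dfrac{\omega_i^{I}}{\text{MIPS}_i}} = \frac{1}{\dfrac{I_1}{\sum I_i}\cdot\dfrac{1}{\text{MIPS}_1} + \dfrac{I_2}{\sum I_i}\cdot\dfrac{1}{\text{MIPS}_2} + \cdots} = \frac{\sum I_i}{\sum \dfrac{I_i}{\text{MIPS}_i}} = \frac{\sum I_i}{\sum t_i} \qquad (2)$$

where $\omega_i^{I}$ is the weight of benchmark i according to instruction count and the last step uses $\text{MIPS}_i = I_i/t_i$. This, we know, is the overall MIPS according to equation 1.

Now, it can be seen that the same result can be obtained by taking a weighted arithmetic mean of the individual MIPS with weights corresponding to the execution times spent in each benchmark in the suite:

$$\text{W.A.M.} = \sum \omega_i^{t}\,\text{MIPS}_i = \frac{t_1}{\sum t_i}\text{MIPS}_1 + \frac{t_2}{\sum t_i}\text{MIPS}_2 + \cdots = \frac{1}{\sum t_i}\left[t_1\frac{I_1}{t_1} + t_2\frac{I_2}{t_2} + \cdots\right] = \frac{\sum I_i}{\sum t_i} = \text{Overall MIPS}$$

where $\omega_i^{t}$ is the weight of benchmark i according to execution time.

Thus, if the individual MIPS and the relative weights of instruction counts or execution times are known, the overall MIPS can be computed. Table 1 illustrates an example benchmark suite with 5 benchmarks, their individual instruction counts, individual execution times and the individual MIPS. Let us calculate the overall MIPS of the suite directly from the overall instruction count and the overall execution time.

Overall instruction count = 2000 million
Overall execution time = 10 sec
Overall MIPS = 2000/10 = 200

Table 1: An example benchmark suite with 5 benchmarks, their individual instruction counts, individual execution times and individual MIPS

| Benchmark | Instruction Count (in millions) | Time (sec) | Individual MIPS |
|---|---|---|---|
| 1 | 500 | 2 | 250 |
| 2 | 50 | 1 | 50 |
| 3 | 200 | 1 | 200 |
| 4 | 1000 | 5 | 200 |
| 5 | 250 | 1 | 250 |

We can also calculate the overall MIPS from the individual MIPS and the weights of the individual benchmarks.

Weights of the benchmarks with respect to I-counts = 500/2000, 50/2000, 200/2000, 1000/2000, 250/2000 = 0.25 : 0.025 : 0.1 : 0.5 : 0.125
Weights of the benchmarks with respect to time = 0.2 : 0.1 : 0.1 : 0.5 : 0.1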
WHM of individual MIPS (weighted with I-counts) = 1/(0.25/250 + 0.025/50 + 0.1/200 + 0.5/200 + 0.125/250) = 200
WAM of individual MIPS (weighted with time) = 250*0.2 + 50*0.1 + 200*0.1 + 200*0.5 + 250*0.1 = 200

Thus, either the weighted arithmetic mean or the weighted harmonic mean can be used to find the overall mean, if the appropriate weights are properly applied. It can also be seen that the simple (unweighted) arithmetic mean and the simple (unweighted) harmonic mean are not correct if the target workload is the sum of the five component benchmarks.

Unweighted AM of individual MIPS = 190
Unweighted HM of individual MIPS = 131.58

Neither of these is indicative of the overall MIPS. Of course, the benchmarks are not equally weighted in the suite, and hence the unweighted means are not correct.

In general, if a metric is obtained by dividing A by B, the harmonic mean is correct if A is weighted equally between the benchmarks, and the arithmetic mean is correct if B is weighted equally among the component benchmarks in a suite, when calculating the central tendency of the metric A/B. In other words, the harmonic mean with weights corresponding to the measure in the numerator, or the arithmetic mean with weights corresponding to the measure in the denominator, is valid when trying to find the aggregate measure from the values of the measure in the individual benchmarks. We use this principle to find the correct means for a variety of performance metrics. This is shown in Table 2.

Somehow there seems to be an impression that the arithmetic mean is naïve and useless. The arithmetic mean is meaningless for MIPS or MFLOPS when each benchmark contains an equal number of instructions or an equal number of floating point operations; however, it is meaningful in many situations. Consider the following situation: a computer runs digital logic simulation for half the time (in a day) and runs chemistry codes for the other half of the day. A benchmark suite is created consisting of 2 benchmarks, one of each kind. It achieves $\text{MIPS}_1$ on the digital logic simulation benchmark and $\text{MIPS}_2$ on the chemistry benchmark. The overall MIPS of the target system is the arithmetic mean of the MIPS from the two individual benchmarks and not the harmonic mean.

Table 2: The mean to be used to find an aggregate measure over a benchmark suite from measures corresponding to individual benchmarks in the suite

| Measure | Valid central tendency for summarized measure over the suite (W.A.M.) | Valid central tendency for summarized measure over the suite (W.H.M.) |
|---|---|---|
| IPC | W.A.M. weighted with cycles | W.H.M. weighted with I-count |
| CPI | W.A.M. weighted with I-count | W.H.M. weighted with cycles |
| Speedup | W.A.M. weighted with execution time ratios in improved system | W.H.M. weighted with execution time ratios in the baseline system |
| MIPS | W.A.M. weighted with time | W.H.M. weighted with I-count |
| MFLOPS | W.A.M. weighted with time | W.H.M. weighted with FLOP count |
| Cache hit rate | W.A.M. weighted with number of references to cache | W.H.M. weighted with number of hits |
| Cache misses per instruction | W.A.M. weighted with I-count | W.H.M. weighted with number of misses |
| Branch misprediction rate per branch | W.A.M. weighted with branch counts | W.H.M. weighted with number of mispredictions |
| Normalized execution time | W.A.M. weighted with execution times in system considered as base | W.H.M. weighted with execution times in the system being evaluated |
| Transactions per minute | W.A.M. weighted with exec times | W.H.M. weighted with proportion of transactions for each benchmark |
| A/B | W.A.M. weighted with B's | W.H.M. weighted with A's |
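The A/B principle can be checked numerically. The following is a minimal sketch, not taken from the paper: the helper names `weighted_arithmetic_mean` and `weighted_harmonic_mean` are hypothetical, and the instruction counts (500, 50, 200, 1000, 250 million) and times (2, 1, 1, 5, 1 sec) are the Table 1 values as read here. MIPS is A/B with A = instruction count and B = time, so weighting the harmonic mean by A and the arithmetic mean by B should both reproduce the overall MIPS.

```python
import math

def weighted_arithmetic_mean(values, weights):
    """Sum of w_i * x_i divided by the sum of the weights."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)

def weighted_harmonic_mean(values, weights):
    """Sum of the weights divided by the sum of w_i / x_i."""
    return sum(weights) / sum(w / x for x, w in zip(values, weights))

# Assumed reading of Table 1: A = instruction counts (millions), B = times (sec).
instr_millions = [500, 50, 200, 1000, 250]
time_sec = [2, 1, 1, 5, 1]
mips = [a / b for a, b in zip(instr_millions, time_sec)]

# Ground truth: total instructions divided by total time.
overall = sum(instr_millions) / sum(time_sec)

# Harmonic mean weighted by the numerator (A), arithmetic mean by the denominator (B).
whm = weighted_harmonic_mean(mips, instr_millions)
wam = weighted_arithmetic_mean(mips, time_sec)

# Unweighted means, for comparison.
plain_am = sum(mips) / len(mips)
plain_hm = len(mips) / sum(1 / x for x in mips)

print(overall, whm, wam)                        # the three values agree
print(round(plain_am, 2), round(plain_hm, 2))   # the unweighted means do not
```

The weights are passed unnormalized because both helpers divide by the sum of the weights; passing the fractions 0.25, 0.025, ... gives the same result.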
Speedup

Speedup is a very commonly used metric in the architecture community; perhaps the single most frequently used metric. Let us consider the example in Table 3.

Table 3: An example benchmark suite with 5 benchmarks, their individual execution times on the 2 systems under comparison and the individual speedups of the benchmarks

| Benchmark | Time on baseline system | Time on enhanced system | Individual Speedup |
|---|---|---|---|
| 1 | 500 | 250 | 2 |
| 2 | 50 | 50 | 1 |
| 3 | 200 | 50 | 4 |
| 4 | 1000 | 1250 | 0.8 |
| 5 | 250 | 200 | 1.25 |

Total time on baseline system = 2000 sec
Total time on enhanced system = 1800 sec

If the entire benchmark suite is run on the baseline system and on the enhanced system, we know that the overall speedup = 2000/1800 = 1.11.

Now, given the individual speedups, which mean should be used to find the overall speedup? We contend that the overall speedup can be found by either the arithmetic or the harmonic mean with appropriate weights. One needs to know the relative weights (with respect to execution time) of the different benchmarks on the baseline and/or enhanced system.

Weights of the benchmarks on the baseline system = 500/2000, 50/2000, 200/2000, 1000/2000, 250/2000
Weights of the benchmarks on the enhanced system = 250/1800, 50/1800, 50/1800, 1250/1800, 200/1800

WHM of individual speedups (weighted with time on the baseline machine)
= 1/(500/(2000*2) + 50/(2000*1) + 200/(2000*4) + 1000/(2000*0.8) + 250/(2000*1.25))
= 1/(250/2000 + 50/2000 + 50/2000 + 1250/2000 + 200/2000)
= 1/(1800/2000) = 2000/1800 = 1.11

WAM of individual speedups (weighted with time on the enhanced machine)
= 2*250/1800 + 1*50/1800 + 4*50/1800 + 0.8*1250/1800 + 1.25*200/1800
= (500/1800 + 50/1800 + 200/1800 + 1000/1800 + 250/1800)
= 2000/1800 = 1.11

Thus, if the speedup of a system with respect to a baseline system is available for several programs of a benchmark suite, the W.H.M. of the speedups for the individual benchmarks with weights corresponding to the execution times in the baseline system, or the W.A.M. of the speedups for the individual benchmarks with weights corresponding to the execution times in the improved system, can yield the overall speedup over the entire suite.

Now, consider a situation as in Table 4.

Table 4: An example where the unweighted A.M. of the individual speedups or the weighted H.M. is the correct aggregate speedup

| Benchmark | Time on baseline system | Time on enhanced system | Individual Speedup |
|---|---|---|---|
| 1 | 200 | 100 | 2 |
| 2 | 100 | 100 | 1 |
| 3 | 400 | 100 | 4 |
| 4 | 80 | 100 | 0.8 |
| 5 | 125 | 100 | 1.25 |

Based on execution times, we know that the overall speedup is 905/500 (= 1.81), which is equal to the unweighted arithmetic mean of the individual speedups. As you can see, each program had equal weight on the enhanced machine.
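This is indicative of a condition where the workload is not fixed, but all types of workloads are equally probable on the target system. Please note that the same correct answer can be obtained if the harmonic mean of the individual speedups with weights corresponding to execution times on the baseline system is used.

Next, let us consider a situation as in Table 5.

Table 5: An example where the unweighted H.M. of the individual speedups or the weighted A.M. is the correct aggregate speedup

| Benchmark | Time on baseline system | Time on enhanced system | Individual Speedup |
|---|---|---|---|
| 1 | 100 | 50 | 2 |
| 2 | 100 | 100 | 1 |
| 3 | 100 | 25 | 4 |
| 4 | 100 | 125 | 0.8 |
| 5 | 100 | 80 | 1.25 |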
The overall speedup is 500/380, based on the total execution times in the two systems. It can also be derived from the individual speedups as the unweighted harmonic mean of the individual speedups. In this case, the unweighted harmonic mean is correct because the programs are equally weighted on the baseline system. It may be noted that the same correct answer can be obtained if the arithmetic mean of the individual speedups with weights corresponding to execution times on the enhanced system is used.

One might notice that the average speedup is heavily swayed by the relative durations of the benchmarks. It is clear that the relative execution times of the benchmarks in a suite are important. However, how much thought has gone into deciding the relative durations of execution of the different benchmarks? In SPECINT2000, the baseline running times are 1400, 1400, 1100, 1800, 1000, 1800, 1300, 1800, 1100, 1900, 1500 and 3000 time units for gzip, vpr, gcc, mcf, crafty, parser, eon, perlbmk, gap, vortex, bzip2 and twolf respectively [5]. Apparently these running times were derived based on the time these programs took on a reference machine.

What mean should be used for speedups from SPEC benchmarks? If the aggregate number of interest is the speedup, and if the exact same SPEC benchmark suite is run in its entirety on the new system, then the W.H.M. with weights of execution times of each of the benchmarks on the baseline system should be used. This represents the condition where the target workload is exactly the same as the SPEC benchmark suite. If one argues that the relative durations of the SPEC benchmarks in the SPEC suite (as dictated by SPEC) mean nothing to him/her, the unweighted harmonic mean of speedups can be used. If one is interested in knowing the speedup for an imaginary workload in which each type of SPEC program is run for equal parts of the day on the target system, the A.M. of the individual speedups should be used.

So if someone summarizes individual MIPS using the unweighted harmonic mean, what does it indicate? It is a valid indicator of the overall MIPS of the suite if every benchmark had an equal number of instructions. Since either the arithmetic or the harmonic mean with corresponding weights is appropriate for most metrics, we can summarize the conditions under which unweighted arithmetic and harmonic means are valid for each metric. Table 6 presents this.

Table 6: Conditions under which unweighted arithmetic and harmonic means are valid indicators of overall performance

| Measure | To summarize measure over the suite: when is A.M. valid? | To summarize measure over the suite: when is H.M. valid? |
|---|---|---|
| IPC | If equal cycles in each benchmark | If equal work (I-count) in each benchmark |
| CPI | If equal I-count in each benchmark | If equal cycles in each benchmark |
| Speedup | If equal execution times in each benchmark in the improved system | If equal execution times in each benchmark in the baseline system |
| MIPS | If equal times in each benchmark | If equal I-count in each benchmark |
| MFLOPS | If equal times in each benchmark | If equal FLOPS in each benchmark |
| Cache hit rate | If equal number of references to cache for each benchmark | If equal number of cache hits in each benchmark |
| Cache misses per instruction | If equal I-count in each benchmark | If equal number of misses in each benchmark |
| Branch misprediction rate per branch | If equal number of branches in each benchmark | If equal number of mispredictions in each benchmark |
| Normalized execution time | If equal execution times in each benchmark in the system considered as base | If equal execution times in each benchmark in the system being evaluated |
| Transactions per minute | If equal times in each benchmark | If equal number of transactions in each benchmark |
| A/B | If B's are equal | If A's are equal |
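The speedup row of these conditions can be illustrated with a small sketch. This code is not from the paper: the function names are hypothetical, and the execution times are the values of Tables 3, 4 and 5 as read here. It compares the overall speedup (total baseline time over total enhanced time) with the weighted and unweighted means of the individual speedups for a general case, an equal-enhanced-time case and an equal-baseline-time case.

```python
import math

def weighted_arithmetic_mean(values, weights):
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)

def weighted_harmonic_mean(values, weights):
    return sum(weights) / sum(w / x for x, w in zip(values, weights))

def summarize(baseline, enhanced):
    """Return the overall speedup and four candidate means of the per-benchmark speedups."""
    speedups = [b / e for b, e in zip(baseline, enhanced)]
    ones = [1] * len(speedups)
    return {
        "overall": sum(baseline) / sum(enhanced),
        "whm_baseline_time": weighted_harmonic_mean(speedups, baseline),
        "wam_enhanced_time": weighted_arithmetic_mean(speedups, enhanced),
        "unweighted_am": weighted_arithmetic_mean(speedups, ones),
        "unweighted_hm": weighted_harmonic_mean(speedups, ones),
    }

# Assumed readings of the example tables (execution times per benchmark).
general = summarize([500, 50, 200, 1000, 250], [250, 50, 50, 1250, 200])
equal_enhanced = summarize([200, 100, 400, 80, 125], [100] * 5)
equal_baseline = summarize([100] * 5, [50, 100, 25, 125, 80])

for name, result in [("general", general),
                     ("equal enhanced times", equal_enhanced),
                     ("equal baseline times", equal_baseline)]:
    print(name, {k: round(v, 3) for k, v in result.items()})
```

In all three cases the two weighted means match the overall speedup; the unweighted arithmetic mean matches only when the enhanced times are equal, and the unweighted harmonic mean only when the baseline times are equal.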
Smith uses the meaning equal work or equal number of floating point operations for equal weights [1]. Under that condition, Table 6 does illustrate that the harmonic mean is the right mean for MFLOPS. The Weighted Harmonic Mean with weights corresponding to the number of floating point operations, or the W.A.M. with weights corresponding to the execution times of the benchmarks, correctly yields the overall MFLOPS.

Ideally, the running time of benchmarks should be just enough for performance metrics to stabilize. Then, while aggregating the metrics, each program should be weighted for whatever fraction of time it will run in the user's target workload. For instance, if program 1 is a compiler, program 2 is a digital simulation, and program 3 is compression, then for a user whose actual workload is digital simulation for 90% of the day, and 5% compilation and 5% compression, the WAM with weights 0.05, 0.9, 0.05 will yield a valid overall MIPS on the target workload. When one does not know the end user's actual application mix, if the assumption is that each type of benchmark runs for an equal period of time, finding a simple (unweighted) arithmetic mean of MIPS is not an invalid approach.

It appears that everything computer architects deal with can be covered by arithmetic or harmonic means. So what is the geometric mean useful for? Cragon [3] provides an example where the geometric mean can be used to find the mean gain per stage of a multi-stage amplifier, when the gains of the individual stages are given. He also illustrates that, if improvements in CPI and clock periods are given, the mean improvement for these two design changes can be found by the geometric mean. Since execution time is dependent on the product of the two metrics considered here, the mean improvement per change can be evaluated by the geometric mean. But the geometric mean of performance metrics derived from component benchmarks cannot be used to summarize performance over an entire suite. A general rule is that arithmetic or harmonic means make sense when the component quantities are summed to represent the aggregate situation. The geometric mean is meaningful when the component quantities are multiplied to represent the aggregate situation. Since execution times of component benchmarks are added to find the overall execution time, arithmetic or harmonic means should be used.
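The summed-versus-multiplied rule can be made concrete with a short sketch. This is an illustration under stated assumptions rather than anything from the paper: `geometric_mean` is a hypothetical helper, the CPI and clock-period improvement factors (1.5 and 1.2) are invented for the example, and the execution times are the five-benchmark speedup example as read here (baseline 500, 50, 200, 1000, 250; enhanced 250, 50, 50, 1250, 200).

```python
import math

def geometric_mean(values):
    """n-th root of the product of n positive values."""
    return math.prod(values) ** (1 / len(values))

# Multiplicative case: two design changes whose effects multiply.
# The improvement factors below are hypothetical.
cpi_gain, clock_gain = 1.5, 1.2
combined_gain = cpi_gain * clock_gain
gain_per_change = geometric_mean([cpi_gain, clock_gain])
# Applying the mean gain once per change reproduces the combined gain.
print(gain_per_change ** 2, combined_gain)

# Additive case: benchmark execution times add up to the suite time.
baseline = [500, 50, 200, 1000, 250]   # assumed reading of the speedup example
enhanced = [250, 50, 50, 1250, 200]
speedups = [b / e for b, e in zip(baseline, enhanced)]
overall_speedup = sum(baseline) / sum(enhanced)
gm_of_speedups = geometric_mean(speedups)
# The geometric mean of the speedups is not the overall speedup.
print(round(gm_of_speedups, 3), round(overall_speedup, 3))
```

`math.prod` requires Python 3.8 or later.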
In summary, it is possible to summarize performance over a benchmark suite by using arithmetic or harmonic means with appropriate weights. If the metric of interest is obtained by dividing A by B, the harmonic mean is correct if A is weighted equally between the benchmarks, and the arithmetic mean is correct if B is weighted equally among the component benchmarks in a suite, while summarizing the metric over the entire suite. If the speedup of a system with respect to a baseline system is available for several programs of a benchmark suite, the W.H.M. of the speedups for the individual benchmarks with weights corresponding to the execution times in the baseline system, or the W.A.M. of the speedups for the individual benchmarks with weights corresponding to the execution times in the improved system, can yield the overall speedup over the entire suite. The geometric mean does not represent anything meaningful while aggregating performance metrics over benchmarks in a suite.

Acknowledgement: The feedback from Jim Smith, David Lilja, Doug Burger and my students in the Laboratory of Computer Architecture helped to improve this manuscript. The author's research is supported in part by the National Science Foundation under grant no. 00, and by AMD, Intel, IBM and Motorola Corporations.

References
[1] J. E. Smith, Characterizing Computer Performance with a Single Number, Communications of ACM, 31(10):1202-1206, October 1988.
[2] Patterson and Hennessy, Computer Architecture: The Hardware/Software Approach, Morgan Kaufman Publishers.
[3] H. Cragon, Computer Architecture and Implementation, Cambridge University Press.
[4] David Lilja, Measuring Computer Performance: A Practitioner's Guide, Cambridge University Press, 2000.
[5] The CPU2000 Results published by SPEC at: http://www.spec.org/cpu2000/results/cpu2000.html#SPECint