Items of some interest…

These are my recent Pin​board​.in links:

Items of some interest…

These are my recent Pin​board​.in links:

  • [1110.0585] Dis­crim­i­nately Decreas­ing Dis­crim­inabil­ity with Learned Image Filters

    “In machine learn­ing and com­puter vision, input images are often fil­tered to increase data dis­crim­inabil­ity. In some sit­u­a­tions, how­ever, one may wish to pur­posely decrease dis­crim­inabil­ity of one clas­si­fi­ca­tion task (a “dis­trac­tor” task), while simul­ta­ne­ously pre­serv­ing infor­ma­tion rel­e­vant to another (the task-​​of-​​interest): For exam­ple, it may be impor­tant to mask the iden­tity of per­sons con­tained in face images before sub­mit­ting them to a crowd­sourc­ing site (e.g., Mechan­i­cal Turk) when label­ing them for cer­tain facial attrib­utes. Another exam­ple is inter-​​dataset gen­er­al­iza­tion: when train­ing on a dataset with a par­tic­u­lar covari­ance struc­ture among mul­ti­ple attrib­utes, it may be use­ful to sup­press one attribute while pre­serv­ing another so that a trained clas­si­fier does not learn spu­ri­ous cor­re­la­tions between attrib­utes. In this paper we present an algo­rithm that finds opti­mal fil­ters to give high dis­crim­inabil­ity to one task while simul­ta­ne­ously giv­ing low dis­crim­inabil­ity to a dis­trac­tor task. We present results show­ing the effec­tive­ness of the pro­posed tech­nique on both sim­u­lated data and nat­ural face images.”

    machine-​​learning data-​​preparation fil­ter­ing algo­rithms nudge-​​targets
  • [1110.5183] Dif­fu­sion of Infor­ma­tion in Robot Swarms

    “This work is devoted to com­mu­ni­ca­tion approaches, which spread infor­ma­tion in robot swarms. These mech­a­nisms are use­ful for large-​​scale sys­tems and also for such cases when a lim­ited com­mu­ni­ca­tion equip­ment does not allow rout­ing of infor­ma­tion pack­ages. We focus on two approaches such as vir­tual fields and epi­demic algo­rithms, dis­cuss sev­eral aspects of hard­ware imple­men­ta­tion and demon­strate exper­i­ments per­formed with micro­ro­bots “Jasmine”.”

    agent-​​based swarms com­mu­ni­ca­tion complex-​​systems epi­demi­ol­ogy dynamical-​​systems exper­i­ment
  • [1109.5229] Dis­trib­uted Algo­rithms for Opti­mal Power Flow Problem

    “Opti­mal power flow (OPF) is an impor­tant prob­lem for power gen­er­a­tion and it is in gen­eral non-​​convex. With the employ­ment of renew­able energy, it will be desir­able if OPF can be solved very effi­ciently so its solu­tion can be used in real time. With some spe­cial net­work struc­ture, e.g. trees, the prob­lem has been shown to have a zero dual­ity gap and the con­vex dual prob­lem yields the opti­mal solu­tion. In this paper, we pro­pose a pri­mal and a dual algo­rithm to coor­di­nate the smaller sub­prob­lems decom­posed from the con­vex­i­fied OPF. We can arrange the sub­prob­lems to be solved sequen­tially and cumu­la­tively in a cen­tral node or solved in par­al­lel in dis­trib­uted nodes. We test the algo­rithms on IEEE radial dis­tri­b­u­tion test feed­ers, some ran­dom tree-​​structured net­works, and the IEEE trans­mis­sion sys­tem bench­marks. Sim­u­la­tion results show that the com­pu­ta­tion time can be improved dra­mat­i­cally with our algo­rithms over the cen­tral­ized approach of solv­ing the prob­lem with­out decom­po­si­tion, espe­cially in tree-​​structured prob­lems. The com­pu­ta­tion time grows lin­early with the prob­lem size with the cumu­la­tive approach while the dis­trib­uted one can have size-​​independent com­pu­ta­tion time.”

    operations-​​research algo­rithms network-​​theory infra­struc­ture com­po­si­tion nudge-​​targets
  • Coupon collector’s prob­lem — Wikipedia, the free encyclopedia

    “In prob­a­bil­ity the­ory, the coupon collector’s prob­lem describes the “col­lect all coupons and win” con­tests. It asks the fol­low­ing ques­tion: Sup­pose that there are n coupons, from which coupons are being col­lected with replace­ment. What is the prob­a­bil­ity that more than t sam­ple tri­als are needed to col­lect all n coupons? An alter­na­tive state­ment is: Given n coupons, how many coupons do you expect you need to draw with replace­ment before hav­ing drawn each coupon at least once. The math­e­mat­i­cal analy­sis of the prob­lem reveals that the expected num­ber of tri­als needed grows as Θ(nlog(n)). For exam­ple, when n = 50 it takes about 225 tri­als to col­lect all 50 coupons.”

    probability-​​theory slic-​​representation nudge-​​targets
  • [1107.2379] Data Sta­bil­ity in Clus­ter­ing: A Closer Look

    “This paper con­sid­ers the model intro­duced by Bilu and Linial (2010), who study prob­lems for which the opti­mal clus­ter­ing does not change when the dis­tances are per­turbed by mul­ti­plica­tive fac­tors. They show that even when a prob­lem is NP-​​hard, it is some­times pos­si­ble to obtain polynomial-​​time algo­rithms for instances resilient to large per­tur­ba­tions, e.g. on the order of $O(sqrt{n})$ for max-​​cut clus­ter­ing. Awasthi et al. (2010) extend this line of work by con­sid­er­ing center-​​based objec­tives, and Bal­can and Liang (2011) con­sider the $k$-median and min-​​sum objec­tives, giv­ing effi­cient algo­rithms for instances resilient to cer­tain con­stant mul­ti­plica­tive per­tur­ba­tions. Here, we are moti­vated by the ques­tion of to what extent these assump­tions can be relaxed while allow­ing for effi­cient algo­rithms. We show there is lit­tle room to improve these results by giv­ing NP-​​hardness lower bounds for both the $k$-median and min-​​sum objec­tives. On the other hand, we show that mul­ti­plica­tive resilience para­me­ters, even only on the order of $Theta(1)$, can be so strong as to make the clus­ter­ing prob­lem triv­ial, and we exploit these assump­tions to present a sim­ple one pass stream­ing algo­rithm for the $k$-median objec­tive. We also con­sider a model of addi­tive per­tur­ba­tions and give a cor­re­spon­dence between addi­tive and mul­ti­plica­tive notions of sta­bil­ity. Our results pro­vide a close exam­i­na­tion of the con­se­quences of assum­ing, even con­stant, sta­bil­ity in data.”

    clus­ter­ing sta­tis­tics algo­rithms robust­ness nudge-​​targets
  • [1109.5641] Energy-​​Aware Load Bal­anc­ing in Con­tent Deliv­ery Networks

    “Internet-​​scale dis­trib­uted sys­tems such as con­tent deliv­ery net­works (CDNs) oper­ate hun­dreds of thou­sands of servers deployed in thou­sands of data cen­ter loca­tions around the globe. Since the energy costs of oper­at­ing such a large IT infra­struc­ture are a sig­nif­i­cant frac­tion of the total oper­at­ing costs, we argue for redesign­ing CDNs to incor­po­rate energy opti­miza­tions as a first-​​order prin­ci­ple. We pro­pose tech­niques to turn off CDN servers dur­ing peri­ods of low load while seek­ing to bal­ance three key design goals: max­i­mize energy reduc­tion, min­i­mize the impact on client-​​perceived ser­vice avail­abil­ity (SLAs), and limit the fre­quency of on-​​off server tran­si­tions to reduce wear-​​and-​​tear and its impact on hard­ware reli­a­bil­ity. We pro­pose an opti­mal offline algo­rithm and an online algo­rithm to extract energy sav­ings both at the level of local load bal­anc­ing within a data cen­ter and global load bal­anc­ing across data cen­ters. We eval­u­ate our algo­rithms using real pro­duc­tion work­load traces from a large com­mer­cial CDN. Our results show that it is pos­si­ble to reduce the energy con­sump­tion of a CDN by more than 55% while ensur­ing a high level of avail­abil­ity that meets cus­tomer SLA require­ments and incur­ring an aver­age of one on-​​off tran­si­tion per server per day. Fur­ther, we show that keep­ing even 10% of the servers as hot spares helps absorb load spikes due to global flash crowds with lit­tle impact on avail­abil­ity SLAs. Finally, we show that redis­trib­ut­ing load across prox­i­mal data cen­ters can enhance ser­vice avail­abil­ity sig­nif­i­cantly, but has only a mod­est impact on energy savings.”

    algo­rithms load-​​balancing infra­struc­ture nudge-​​targets operations-​​research
  • [1110.0264] Face Recog­ni­tion using Opti­mal Rep­re­sen­ta­tion Ensemble

    “Recently, the face rec­og­niz­ers based on lin­ear rep­re­sen­ta­tions have been shown to deliver state-​​of-​​the-​​art per­for­mance. In real-​​world appli­ca­tions, how­ever, face images usu­ally suf­fer from expres­sions, dis­guises and ran­dom occlu­sions. The prob­lem­atic facial parts under­mine the valid­ity of the linear-​​subspace assump­tion and thus the recog­ni­tion per­for­mance dete­ri­o­rates sig­nif­i­cantly. In this work, we address the prob­lem in a learning-​​inference-​​mixed fash­ion. By observ­ing that the linear-​​subspace assump­tion is more reli­able on cer­tain face patches rather than on the holis­tic face, some Bayesian Patch Rep­re­sen­ta­tions (BPRs) are ran­domly gen­er­ated and inter­preted accord­ing to the Bayes’ the­ory. We then train an ensem­ble model over the patch-​​representations by min­i­miz­ing the empir­i­cal risk w.r.t the “leave-​​one-​​out mar­gins”. The obtained model is termed Opti­mal Rep­re­sen­ta­tion Ensem­ble (ORE), since it guar­an­tees the opti­mal­ity from the per­spec­tive of Empir­i­cal Risk Min­i­miza­tion. To han­dle the unknown pat­terns in test faces, a robust ver­sion of BPR is pro­posed by tak­ing the non-​​face cat­e­gory into con­sid­er­a­tion. Equipped with the Robust-​​BPRs, the infer­ence abil­ity of ORE is increased dra­mat­i­cally and sev­eral record-​​breaking accu­ra­cies (99.9% on Yale-​​B and 99.5% on AR) and desir­able effi­cien­cies (below 20 ms per face in Mat­lab) are achieved. It also over­whelms other mod­u­lar heuris­tics on the faces with ran­dom occlu­sions, extreme expres­sions and dis­guises. Fur­ther­more, to accom­mo­date immense BPRs sets, a boosting-​​like algo­rithm is also derived. The boosted model, a.k.a Boosted-​​ORE, obtains sim­i­lar per­for­mance to its pro­to­type. Besides the empir­i­cal supe­ri­or­i­ties, two desir­able fea­tures of the pro­posed meth­ods, namely, the training-​​determined model-​​selection and the data-​​weight-​​free boost­ing pro­ce­dure, are also the­o­ret­i­cally verified.”

    image-​​analysis face-​​recognition algo­rithms nudge-​​targets
  • [1110.0957] Dic­tio­nary Learn­ing for Deblur­ring and Dig­i­tal Zoom

    “This paper pro­poses a novel approach to image deblur­ring and dig­i­tal zoom­ing using sparse local mod­els of image appear­ance. These mod­els, where small image patches are rep­re­sented as lin­ear com­bi­na­tions of a few ele­ments drawn from some large set (dic­tio­nary) of can­di­dates, have proven well adapted to sev­eral image restora­tion tasks. A key to their suc­cess has been to learn dic­tio­nar­ies adapted to the recon­struc­tion of small image patches. In con­trast, recent works have pro­posed instead to learn dic­tio­nar­ies which are not only adapted to data recon­struc­tion, but also tuned for a spe­cific task. We intro­duce here such an approach to deblur­ring and dig­i­tal zoom, using pairs of blurry/​sharp (or low-​​/​high-​​resolution) images for train­ing, as well as an effec­tive sto­chas­tic gra­di­ent algo­rithm for solv­ing the cor­re­spond­ing opti­miza­tion task. Although this learn­ing prob­lem is not con­vex, once the dic­tio­nar­ies have been learned, the sharp/​high-​​resolution image can be recov­ered via con­vex opti­miza­tion at test time. Exper­i­ments with syn­thetic and real data demon­strate the effec­tive­ness of the pro­posed approach, lead­ing to state-​​of-​​the-​​art per­for­mance for non-​​blind image deblur­ring and dig­i­tal zoom.”

    image-​​processing image-​​analysis algo­rithms nudge-​​targets hyper­res­o­lu­tion
  • [1110.0477] Dis­trib­uted Evo­lu­tion­ary Graph Partitioning

    “We present a novel dis­trib­uted evo­lu­tion­ary algo­rithm, KaFF­PaE, to solve the Graph Par­ti­tion­ing Prob­lem, which makes use of KaFFPa (Karl­sruhe Fast Flow Par­ti­tioner). The use of our mul­ti­level graph par­ti­tioner KaFFPa pro­vides new effec­tive crossover and muta­tion oper­a­tors. By com­bin­ing these with a scal­able com­mu­ni­ca­tion pro­to­col we obtain a sys­tem that is able to improve the best known par­ti­tion­ing results for many inputs in a very short amount of time. For exam­ple, in Walshaw’s well known bench­mark tables we are able to improve or recom­pute 76% of entries for the tables with 1%, 3% and 5% imbalance.”

    algo­rithms graph-​​theory evolutionary-​​algorithms nudge-​​targets
  • [1104.1605] Effi­cient Top-​​K Retrieval in Online Social Tag­ging Networks

    “We con­sider in this paper top-​​k query answer­ing in social tag­ging sys­tems, also known as folk­sonomies. This prob­lem requires a sig­nif­i­cant depar­ture from exist­ing, socially agnos­tic tech­niques. In a network-​​aware con­text, one can (and should) exploit the social links, which can indi­cate how users relate to the seeker and how much weight their tag­ging actions should have in the result build-​​up. We pro­pose an algo­rithm that has the poten­tial to scale to cur­rent appli­ca­tions. While the prob­lem has already been con­sid­ered in pre­vi­ous lit­er­a­ture, this was done either under strong sim­pli­fy­ing assump­tions or under choices that can­not scale to even moderate-​​size real world appli­ca­tions. We first con­sider a key aspect of the prob­lem, which is access­ing the clos­est or most rel­e­vant users for a given seeker. We describe how this can be done on the fly (with­out any pre-​​computations) for sev­eral pos­si­ble choices — arguably the most nat­ural ones — of prox­im­ity com­pu­ta­tion in a user net­work. Based on this, our top-​​k algo­rithm is sound and com­plete, while address­ing the scal­a­bil­ity issues of the exist­ing ones. Impor­tantly, our tech­nique is instance opti­mal in the case when the search relies exclu­sively on the social weight of tag­ging actions. To fur­ther reduce response times, we then con­sider direc­tions for effi­ciency by approx­i­ma­tion. Exten­sive exper­i­ments on real world data show that our tech­niques can dras­ti­cally improve the response time, with­out sac­ri­fic­ing precision.”

    folk­son­omy tag­ging network-​​theory search-​​algorithms nudge-​​targets data-​​access

Items of some interest…

These are my recent Pin​board​.in links:

  • About — Hamil­ton Wood Type & Print­ing Museum

    “The Museum, at 40,000 square feet, is no doubt one of the largest fully func­tional work­shops in the world. Not only do the thou­sands of vis­i­tors who come through every year get to see how wood type was made at the foundry, stu­dents, artists, typog­ra­phers and design­ers visit to take work­shops and actu­ally put their hands on and use the col­lec­tion to cre­ate works of art and schol­ar­ship in our press­room at the Museum. To be able to use the type and cuts and a press to make a print can broaden a design student’s under­stand­ing of typog­ra­phy and color and lay­out, and artists make work with wood type that would have sur­prised and delighted Ed Hamil­ton, the company’s founder.”

    field-​​trips to-​​visit museum typog­ra­phy road-​​trip