Items of some interest:

These are my recent Pin​board​.in links:

  • [1203.1644] The B36/​S125 “2×2″ Life-​​Like Cel­lu­lar Automaton

    “The B36/​S125 (or “2x2”) cel­lu­lar automa­ton is one that takes place on a 2D square lat­tice much like Conway’s Game of Life. Although it exhibits high-​​level behav­iour that is sim­i­lar to Life, such as chaotic but even­tu­ally sta­ble evo­lu­tion and the exis­tence of a nat­ural diag­o­nal glider, the indi­vid­ual objects that the rule con­tains gen­er­ally look very dif­fer­ent from their Life coun­ter­parts. In this arti­cle, a his­tory of notable dis­cov­er­ies in the 2×2 rule is pro­vided, and the fun­da­men­tal pat­terns of the automa­ton are described. Some the­o­ret­i­cal results are derived along the way, includ­ing a proof that the speed lim­its for diag­o­nal and orthog­o­nal space­ships in this rule are c/​3 and c/​2, respec­tively. A Mar­go­lus block cel­lu­lar automa­ton that 2×2 emu­lates is inves­ti­gated, and in par­tic­u­lar a fam­ily of oscil­la­tors made up entirely of 2 x 2 blocks are ana­lyzed and used to show that there exist oscil­la­tors with period 2m(2k — 1) for any inte­gers m,k geq 1.”

    cellular-​​automata artificial-​​life discrete-​​mathematics emer­gence mathematical-​​recreations nudge-​​targets
  • [1203.1034] Gen­eral Com­plex Poly­no­mial Root Solver and Its Fur­ther Opti­miza­tion for Binary Microlenses

    “We present a new algo­rithm to solve poly­no­mial equa­tions, and pub­lish its code, which is 1.6−3 times faster than the ZROOTS sub­rou­tine that is com­mer­cially avail­able from Numer­i­cal Recipes, depend­ing on appli­ca­tion. The largest improve­ment, when com­pared to naive solvers, comes from a fail-​​safe pro­ce­dure that per­mits us to skip the major­ity of the cal­cu­la­tions in the great major­ity of cases, with­out risk­ing cat­a­strophic fail­ure in the few cases that these are actu­ally required. Sec­ond, we iden­tify a dis­crim­i­nant that enables a ratio­nal choice between Laguerre’s Method and Newton’s Method (or a new inter­me­di­ate method) on a case-​​by-​​case basis. We briefly review the his­tory of root solv­ing and demon­strate that “Newton’s Method” was dis­cov­ered nei­ther by New­ton (1671) nor by Raph­son (1690), but only by Simp­son (1740). Some of the argu­ments lead­ing to this con­clu­sion were first given by the British his­to­rian of sci­ence Nick Koller­strom in 1992, but these do not appear to have pen­e­trated the astro­nom­i­cal com­mu­nity. Finally, we argue that Numer­i­cal Recipes should vol­un­tar­ily sur­ren­der its copy­right pro­tec­tion for non-​​profit appli­ca­tions, despite the fact that, in this par­tic­u­lar case, such pro­tec­tion was the major stim­u­lant for devel­op­ing our improved algorithm.”

    algo­rithms numerical-​​methods optics nudge-​​targets
  • [1203.1065] Sub­space clus­ter­ing of high-​​dimensional data: a pre­dic­tive approach

    “In sev­eral appli­ca­tion domains, high-​​dimensional obser­va­tions are col­lected and then analysed in search for nat­u­rally occur­ring data clus­ters which might pro­vide fur­ther insights about the nature of the prob­lem. In this paper we describe a new approach for par­ti­tion­ing such high-​​dimensional data. Our assump­tion is that, within each clus­ter, the data can be approx­i­mated well by a lin­ear sub­space esti­mated by means of a prin­ci­pal com­po­nent analy­sis (PCA). The pro­posed algo­rithm, Pre­dic­tive Sub­space Clus­ter­ing (PSC) par­ti­tions the data into clus­ters while simul­ta­ne­ously esti­mat­ing cluster-​​wise PCA para­me­ters. The algo­rithm min­imises an objec­tive func­tion that depends upon a new mea­sure of influ­ence for PCA mod­els. A penalised ver­sion of the algo­rithm is also described for car­ry­ing our simul­ta­ne­ous sub­space clus­ter­ing and vari­able selec­tion. The con­ver­gence of PSC is dis­cussed in detail, and exten­sive sim­u­la­tion results and com­par­isons to com­pet­ing meth­ods are pre­sented. The com­par­a­tive per­for­mance of PSC has been assessed on six real gene expres­sion data sets for which PSC often pro­vides state-​​of-​​art results.”

    ain’t-performance-space sta­tis­tics clus­ter­ing cure-​​for-​​dimensionality algo­rithms
  • [1203.1067] Cor­ti­cal free asso­ci­a­tion dynam­ics: dis­tinct phases of a latch­ing network

    “… The occur­rence and dura­tion of latch­ing dynam­ics is found through sim­u­la­tions to depend crit­i­cally on the strength of local attrac­tor states, expressed in the Potts model by a para­me­ter w. Here we describe with sim­u­la­tions and then ana­lyt­i­cally the bound­aries between dis­tinct phases of no latch­ing, of tran­sient and sus­tained latch­ing, deriv­ing a phase dia­gram in the plane w-​​T, where T param­e­trizes ther­mal noise effects. Impli­ca­tions for real cor­ti­cal dynam­ics are briefly reviewed in the conclusions.”

    neural-​​networks biologically-​​inspired dynamical-​​systems emergent-​​design nudge-​​targets
  • Fright­en­ingly Ambi­tious Startup Ideas

    “One of the more sur­pris­ing things I’ve noticed while work­ing on Y Com­bi­na­tor is how fright­en­ing the most ambi­tious startup ideas are. In this essay I’m going to demon­strate this phe­nom­e­non by describ­ing some. Any one of them could make you a bil­lion­aire. That might sound like an attrac­tive prospect, and yet when I describe these ideas you may notice you find your­self shrink­ing away from them.”

    every-​​idea-​​is-​​born star­tups inno­va­tion
  • [1201.6054] Attain­abil­ity in Repeated Games with Vec­tor Payoffs

    “We intro­duce the con­cept of attain­able sets of pay­offs in two-​​player repeated games with vec­tor pay­offs. A set of pay­off vec­tors is called {em attain­able} if player 1 can ensure that there is a finite hori­zon $T$ such that after time $T$ the dis­tance between the set and the cumu­la­tive pay­off is arbi­trar­ily small, regard­less of what strat­egy player 2 is using. This paper focuses on the case where the attain­able set con­sists of one pay­off vec­tor. In this case the vec­tor is called an attain­able vec­tor. We study prop­er­ties of the set of attain­able vec­tors, and char­ac­ter­ize when a spe­cific vec­tor is attain­able and when every vec­tor is attainable.”

    game-​​theory agent-​​based multiobjective-​​optimization nudge-​​targets
  • [1203.1080] Data Struc­ture Lower Bounds on Ran­dom Access to Grammar-​​Compressed Strings

    “In this paper we inves­ti­gate the prob­lem of build­ing a sta­tic data struc­ture that rep­re­sents a string s using space close to its com­pressed size, and allows fast access to indi­vid­ual char­ac­ters of s. …”

    gram­mars algo­rithms nudge-​​targets

Items of some interest…

These are my recent Pin​board​.in links:

  • [1108.4135] Complex-​​Valued Autoencoders

    “Autoen­coders are unsu­per­vised machine learn­ing cir­cuits whose learn­ing goal is to min­i­mize a dis­tor­tion mea­sure between inputs and out­puts. Lin­ear autoen­coders can be defined over any field and only real-​​valued lin­ear autoen­coder have been stud­ied so far. Here we study complex-​​valued lin­ear autoen­coders where the com­po­nents of the train­ing vec­tors and adjustable matri­ces are defined over the com­plex field with the $L_​2$ norm. We pro­vide sim­pler and more gen­eral proofs that unify the real-​​valued and complex-​​valued cases, show­ing that in both cases the land­scape of the error func­tion is invari­ant under cer­tain groups of trans­for­ma­tions. The land­scape has no local min­ima, a fam­ily of global min­ima asso­ci­ated with Prin­ci­pal Com­po­nent Analy­sis, and many fam­i­lies of sad­dle points asso­ci­ated with orthog­o­nal pro­jec­tions onto sub-​​space spanned by sub-​​optimal sub­sets of eigen­vec­tors of the covari­ance matrix. The the­ory yields sev­eral iter­a­tive, con­ver­gent, learn­ing algo­rithms, a clear under­stand­ing of the gen­er­al­iza­tion prop­er­ties of the trained autoen­coders, and can equally be applied to the hetero-​​associative case when exter­nal tar­gets are pro­vided. Par­tial results on deep archi­tec­ture as well as the dif­fer­en­tial geom­e­try of autoen­coders are also pre­sented. The gen­eral frame­work described here is use­ful to clas­sify autoen­coders and iden­tify gen­eral com­mon prop­er­ties that ought to be inves­ti­gated for each class, illu­mi­nat­ing some of the con­nec­tions between infor­ma­tion the­ory, unsu­per­vised learn­ing, clus­ter­ing, Heb­bian learn­ing, and auto encoders.”

    neural-​​networks machine-​​learning clas­si­fi­ca­tion encod­ing algo­rithms nudge-​​targets
  • [1108.5685] Pre­dict­ing flow rever­sals in chaotic nat­ural con­vec­tion using data assimilation

    “A sim­pli­fied model of nat­ural con­vec­tion, sim­i­lar to the Lorenz (1963) sys­tem, is com­pared to com­pu­ta­tional fluid dynam­ics sim­u­la­tions in order to test data assim­i­la­tion meth­ods and bet­ter under­stand the dynam­ics of con­vec­tion. The ther­mosyphon is rep­re­sented by a long time flow sim­u­la­tion, which serves as a ref­er­ence “truth”. Fore­casts are then made using the Lorenz-​​like model and syn­chro­nized to noisy and lim­ited obser­va­tions of the truth using data assim­i­la­tion. The result­ing analy­sis is observed to infer dynam­ics absent from the model when using short assim­i­la­tion win­dows. Fur­ther­more, chaotic flow rever­sal occur­rence and res­i­dency times in each rota­tional state are fore­cast using analy­sis data. Flow rever­sals have been suc­cess­fully fore­cast in the related Lorenz sys­tem, as part of a per­fect model exper­i­ment, but never in the pres­ence of sig­nif­i­cant model error or unob­served vari­ables. Finally, we pro­vide new details con­cern­ing the fluid dynam­i­cal processes present in the ther­mosyphon dur­ing these flow reversals.”

    chaos dynamical-​​systems exper­i­ment pre­dic­tion numerical-​​methods algo­rithms nudge-​​targets
  • [1108.1320] Com­pressed Matrix Multiplication

    “Moti­vated by the prob­lems of com­put­ing sam­ple covari­ance matri­ces, and of trans­form­ing a col­lec­tion of vec­tors to a basis where they are sparse, we present a sim­ple algo­rithm that com­putes an approx­i­ma­tion of the prod­uct of two n-​​by-​​n real matri­ces A and B.…”

    approx­i­ma­tion algo­rithms nudge-​​targets
  • [1110.5296] Com­put­ing a Longest Com­mon Palin­dromic Subsequence

    “The {em longest com­mon sub­se­quence (LCS)} prob­lem is a clas­sic and well-​​studied prob­lem in com­puter sci­ence. Palin­drome is a word which reads the same for­ward as it does back­ward. The {em longest com­mon palin­dromic sub­se­quence (LCPS)} prob­lem is an inter­est­ing vari­ant of the clas­sic LCS prob­lem which finds the longest com­mon sub­se­quence between two given strings such that the com­puted sub­se­quence is also a palin­drome. In this paper, we study the LCPS prob­lem and give effi­cient algo­rithms to solve this prob­lem. To the best of our knowl­edge, this is the first attempt to study and solve this inter­est­ing problem.”

    com­bi­na­torics strings algo­rithms nudge-​​targets

Items of some interest…

These are my recent Pin​board​.in links:

  • [1110.0585] Dis­crim­i­nately Decreas­ing Dis­crim­inabil­ity with Learned Image Filters

    “In machine learn­ing and com­puter vision, input images are often fil­tered to increase data dis­crim­inabil­ity. In some sit­u­a­tions, how­ever, one may wish to pur­posely decrease dis­crim­inabil­ity of one clas­si­fi­ca­tion task (a “dis­trac­tor” task), while simul­ta­ne­ously pre­serv­ing infor­ma­tion rel­e­vant to another (the task-​​of-​​interest): For exam­ple, it may be impor­tant to mask the iden­tity of per­sons con­tained in face images before sub­mit­ting them to a crowd­sourc­ing site (e.g., Mechan­i­cal Turk) when label­ing them for cer­tain facial attrib­utes. Another exam­ple is inter-​​dataset gen­er­al­iza­tion: when train­ing on a dataset with a par­tic­u­lar covari­ance struc­ture among mul­ti­ple attrib­utes, it may be use­ful to sup­press one attribute while pre­serv­ing another so that a trained clas­si­fier does not learn spu­ri­ous cor­re­la­tions between attrib­utes. In this paper we present an algo­rithm that finds opti­mal fil­ters to give high dis­crim­inabil­ity to one task while simul­ta­ne­ously giv­ing low dis­crim­inabil­ity to a dis­trac­tor task. We present results show­ing the effec­tive­ness of the pro­posed tech­nique on both sim­u­lated data and nat­ural face images.”

    machine-​​learning data-​​preparation fil­ter­ing algo­rithms nudge-​​targets
  • [1110.5183] Dif­fu­sion of Infor­ma­tion in Robot Swarms

    “This work is devoted to com­mu­ni­ca­tion approaches, which spread infor­ma­tion in robot swarms. These mech­a­nisms are use­ful for large-​​scale sys­tems and also for such cases when a lim­ited com­mu­ni­ca­tion equip­ment does not allow rout­ing of infor­ma­tion pack­ages. We focus on two approaches such as vir­tual fields and epi­demic algo­rithms, dis­cuss sev­eral aspects of hard­ware imple­men­ta­tion and demon­strate exper­i­ments per­formed with micro­ro­bots “Jasmine”.”

    agent-​​based swarms com­mu­ni­ca­tion complex-​​systems epi­demi­ol­ogy dynamical-​​systems exper­i­ment
  • [1109.5229] Dis­trib­uted Algo­rithms for Opti­mal Power Flow Problem

    “Opti­mal power flow (OPF) is an impor­tant prob­lem for power gen­er­a­tion and it is in gen­eral non-​​convex. With the employ­ment of renew­able energy, it will be desir­able if OPF can be solved very effi­ciently so its solu­tion can be used in real time. With some spe­cial net­work struc­ture, e.g. trees, the prob­lem has been shown to have a zero dual­ity gap and the con­vex dual prob­lem yields the opti­mal solu­tion. In this paper, we pro­pose a pri­mal and a dual algo­rithm to coor­di­nate the smaller sub­prob­lems decom­posed from the con­vex­i­fied OPF. We can arrange the sub­prob­lems to be solved sequen­tially and cumu­la­tively in a cen­tral node or solved in par­al­lel in dis­trib­uted nodes. We test the algo­rithms on IEEE radial dis­tri­b­u­tion test feed­ers, some ran­dom tree-​​structured net­works, and the IEEE trans­mis­sion sys­tem bench­marks. Sim­u­la­tion results show that the com­pu­ta­tion time can be improved dra­mat­i­cally with our algo­rithms over the cen­tral­ized approach of solv­ing the prob­lem with­out decom­po­si­tion, espe­cially in tree-​​structured prob­lems. The com­pu­ta­tion time grows lin­early with the prob­lem size with the cumu­la­tive approach while the dis­trib­uted one can have size-​​independent com­pu­ta­tion time.”

    operations-​​research algo­rithms network-​​theory infra­struc­ture com­po­si­tion nudge-​​targets
  • Coupon collector’s prob­lem — Wikipedia, the free encyclopedia

    “In prob­a­bil­ity the­ory, the coupon collector’s prob­lem describes the “col­lect all coupons and win” con­tests. It asks the fol­low­ing ques­tion: Sup­pose that there are n coupons, from which coupons are being col­lected with replace­ment. What is the prob­a­bil­ity that more than t sam­ple tri­als are needed to col­lect all n coupons? An alter­na­tive state­ment is: Given n coupons, how many coupons do you expect you need to draw with replace­ment before hav­ing drawn each coupon at least once. The math­e­mat­i­cal analy­sis of the prob­lem reveals that the expected num­ber of tri­als needed grows as Θ(nlog(n)). For exam­ple, when n = 50 it takes about 225 tri­als to col­lect all 50 coupons.”

    probability-​​theory slic-​​representation nudge-​​targets
  • [1107.2379] Data Sta­bil­ity in Clus­ter­ing: A Closer Look

    “This paper con­sid­ers the model intro­duced by Bilu and Linial (2010), who study prob­lems for which the opti­mal clus­ter­ing does not change when the dis­tances are per­turbed by mul­ti­plica­tive fac­tors. They show that even when a prob­lem is NP-​​hard, it is some­times pos­si­ble to obtain polynomial-​​time algo­rithms for instances resilient to large per­tur­ba­tions, e.g. on the order of $O(sqrt{n})$ for max-​​cut clus­ter­ing. Awasthi et al. (2010) extend this line of work by con­sid­er­ing center-​​based objec­tives, and Bal­can and Liang (2011) con­sider the $k$-median and min-​​sum objec­tives, giv­ing effi­cient algo­rithms for instances resilient to cer­tain con­stant mul­ti­plica­tive per­tur­ba­tions. Here, we are moti­vated by the ques­tion of to what extent these assump­tions can be relaxed while allow­ing for effi­cient algo­rithms. We show there is lit­tle room to improve these results by giv­ing NP-​​hardness lower bounds for both the $k$-median and min-​​sum objec­tives. On the other hand, we show that mul­ti­plica­tive resilience para­me­ters, even only on the order of $Theta(1)$, can be so strong as to make the clus­ter­ing prob­lem triv­ial, and we exploit these assump­tions to present a sim­ple one pass stream­ing algo­rithm for the $k$-median objec­tive. We also con­sider a model of addi­tive per­tur­ba­tions and give a cor­re­spon­dence between addi­tive and mul­ti­plica­tive notions of sta­bil­ity. Our results pro­vide a close exam­i­na­tion of the con­se­quences of assum­ing, even con­stant, sta­bil­ity in data.”

    clus­ter­ing sta­tis­tics algo­rithms robust­ness nudge-​​targets
  • [1109.5641] Energy-​​Aware Load Bal­anc­ing in Con­tent Deliv­ery Networks

    “Internet-​​scale dis­trib­uted sys­tems such as con­tent deliv­ery net­works (CDNs) oper­ate hun­dreds of thou­sands of servers deployed in thou­sands of data cen­ter loca­tions around the globe. Since the energy costs of oper­at­ing such a large IT infra­struc­ture are a sig­nif­i­cant frac­tion of the total oper­at­ing costs, we argue for redesign­ing CDNs to incor­po­rate energy opti­miza­tions as a first-​​order prin­ci­ple. We pro­pose tech­niques to turn off CDN servers dur­ing peri­ods of low load while seek­ing to bal­ance three key design goals: max­i­mize energy reduc­tion, min­i­mize the impact on client-​​perceived ser­vice avail­abil­ity (SLAs), and limit the fre­quency of on-​​off server tran­si­tions to reduce wear-​​and-​​tear and its impact on hard­ware reli­a­bil­ity. We pro­pose an opti­mal offline algo­rithm and an online algo­rithm to extract energy sav­ings both at the level of local load bal­anc­ing within a data cen­ter and global load bal­anc­ing across data cen­ters. We eval­u­ate our algo­rithms using real pro­duc­tion work­load traces from a large com­mer­cial CDN. Our results show that it is pos­si­ble to reduce the energy con­sump­tion of a CDN by more than 55% while ensur­ing a high level of avail­abil­ity that meets cus­tomer SLA require­ments and incur­ring an aver­age of one on-​​off tran­si­tion per server per day. Fur­ther, we show that keep­ing even 10% of the servers as hot spares helps absorb load spikes due to global flash crowds with lit­tle impact on avail­abil­ity SLAs. Finally, we show that redis­trib­ut­ing load across prox­i­mal data cen­ters can enhance ser­vice avail­abil­ity sig­nif­i­cantly, but has only a mod­est impact on energy savings.”

    algo­rithms load-​​balancing infra­struc­ture nudge-​​targets operations-​​research
  • [1110.0264] Face Recog­ni­tion using Opti­mal Rep­re­sen­ta­tion Ensemble

    “Recently, the face rec­og­niz­ers based on lin­ear rep­re­sen­ta­tions have been shown to deliver state-​​of-​​the-​​art per­for­mance. In real-​​world appli­ca­tions, how­ever, face images usu­ally suf­fer from expres­sions, dis­guises and ran­dom occlu­sions. The prob­lem­atic facial parts under­mine the valid­ity of the linear-​​subspace assump­tion and thus the recog­ni­tion per­for­mance dete­ri­o­rates sig­nif­i­cantly. In this work, we address the prob­lem in a learning-​​inference-​​mixed fash­ion. By observ­ing that the linear-​​subspace assump­tion is more reli­able on cer­tain face patches rather than on the holis­tic face, some Bayesian Patch Rep­re­sen­ta­tions (BPRs) are ran­domly gen­er­ated and inter­preted accord­ing to the Bayes’ the­ory. We then train an ensem­ble model over the patch-​​representations by min­i­miz­ing the empir­i­cal risk w.r.t the “leave-​​one-​​out mar­gins”. The obtained model is termed Opti­mal Rep­re­sen­ta­tion Ensem­ble (ORE), since it guar­an­tees the opti­mal­ity from the per­spec­tive of Empir­i­cal Risk Min­i­miza­tion. To han­dle the unknown pat­terns in test faces, a robust ver­sion of BPR is pro­posed by tak­ing the non-​​face cat­e­gory into con­sid­er­a­tion. Equipped with the Robust-​​BPRs, the infer­ence abil­ity of ORE is increased dra­mat­i­cally and sev­eral record-​​breaking accu­ra­cies (99.9% on Yale-​​B and 99.5% on AR) and desir­able effi­cien­cies (below 20 ms per face in Mat­lab) are achieved. It also over­whelms other mod­u­lar heuris­tics on the faces with ran­dom occlu­sions, extreme expres­sions and dis­guises. Fur­ther­more, to accom­mo­date immense BPRs sets, a boosting-​​like algo­rithm is also derived. The boosted model, a.k.a Boosted-​​ORE, obtains sim­i­lar per­for­mance to its pro­to­type. Besides the empir­i­cal supe­ri­or­i­ties, two desir­able fea­tures of the pro­posed meth­ods, namely, the training-​​determined model-​​selection and the data-​​weight-​​free boost­ing pro­ce­dure, are also the­o­ret­i­cally verified.”

    image-​​analysis face-​​recognition algo­rithms nudge-​​targets
  • [1110.0957] Dic­tio­nary Learn­ing for Deblur­ring and Dig­i­tal Zoom

    “This paper pro­poses a novel approach to image deblur­ring and dig­i­tal zoom­ing using sparse local mod­els of image appear­ance. These mod­els, where small image patches are rep­re­sented as lin­ear com­bi­na­tions of a few ele­ments drawn from some large set (dic­tio­nary) of can­di­dates, have proven well adapted to sev­eral image restora­tion tasks. A key to their suc­cess has been to learn dic­tio­nar­ies adapted to the recon­struc­tion of small image patches. In con­trast, recent works have pro­posed instead to learn dic­tio­nar­ies which are not only adapted to data recon­struc­tion, but also tuned for a spe­cific task. We intro­duce here such an approach to deblur­ring and dig­i­tal zoom, using pairs of blurry/​sharp (or low-​​/​high-​​resolution) images for train­ing, as well as an effec­tive sto­chas­tic gra­di­ent algo­rithm for solv­ing the cor­re­spond­ing opti­miza­tion task. Although this learn­ing prob­lem is not con­vex, once the dic­tio­nar­ies have been learned, the sharp/​high-​​resolution image can be recov­ered via con­vex opti­miza­tion at test time. Exper­i­ments with syn­thetic and real data demon­strate the effec­tive­ness of the pro­posed approach, lead­ing to state-​​of-​​the-​​art per­for­mance for non-​​blind image deblur­ring and dig­i­tal zoom.”

    image-​​processing image-​​analysis algo­rithms nudge-​​targets hyper­res­o­lu­tion
  • [1110.0477] Dis­trib­uted Evo­lu­tion­ary Graph Partitioning

    “We present a novel dis­trib­uted evo­lu­tion­ary algo­rithm, KaFF­PaE, to solve the Graph Par­ti­tion­ing Prob­lem, which makes use of KaFFPa (Karl­sruhe Fast Flow Par­ti­tioner). The use of our mul­ti­level graph par­ti­tioner KaFFPa pro­vides new effec­tive crossover and muta­tion oper­a­tors. By com­bin­ing these with a scal­able com­mu­ni­ca­tion pro­to­col we obtain a sys­tem that is able to improve the best known par­ti­tion­ing results for many inputs in a very short amount of time. For exam­ple, in Walshaw’s well known bench­mark tables we are able to improve or recom­pute 76% of entries for the tables with 1%, 3% and 5% imbalance.”

    algo­rithms graph-​​theory evolutionary-​​algorithms nudge-​​targets
  • [1104.1605] Effi­cient Top-​​K Retrieval in Online Social Tag­ging Networks

    “We con­sider in this paper top-​​k query answer­ing in social tag­ging sys­tems, also known as folk­sonomies. This prob­lem requires a sig­nif­i­cant depar­ture from exist­ing, socially agnos­tic tech­niques. In a network-​​aware con­text, one can (and should) exploit the social links, which can indi­cate how users relate to the seeker and how much weight their tag­ging actions should have in the result build-​​up. We pro­pose an algo­rithm that has the poten­tial to scale to cur­rent appli­ca­tions. While the prob­lem has already been con­sid­ered in pre­vi­ous lit­er­a­ture, this was done either under strong sim­pli­fy­ing assump­tions or under choices that can­not scale to even moderate-​​size real world appli­ca­tions. We first con­sider a key aspect of the prob­lem, which is access­ing the clos­est or most rel­e­vant users for a given seeker. We describe how this can be done on the fly (with­out any pre-​​computations) for sev­eral pos­si­ble choices — arguably the most nat­ural ones — of prox­im­ity com­pu­ta­tion in a user net­work. Based on this, our top-​​k algo­rithm is sound and com­plete, while address­ing the scal­a­bil­ity issues of the exist­ing ones. Impor­tantly, our tech­nique is instance opti­mal in the case when the search relies exclu­sively on the social weight of tag­ging actions. To fur­ther reduce response times, we then con­sider direc­tions for effi­ciency by approx­i­ma­tion. Exten­sive exper­i­ments on real world data show that our tech­niques can dras­ti­cally improve the response time, with­out sac­ri­fic­ing precision.”

    folk­son­omy tag­ging network-​​theory search-​​algorithms nudge-​​targets data-​​access

Items of some interest…

These are my recent Pin​board​.in links:

  • [1110.1462] Dynamic Clus­ter­ing of His­togram Data Based on Adap­tive Squared Wasser­stein Distances

    “…To clus­ter sets of his­togram data, we pro­pose to use Dynamic Clus­ter­ing Algo­rithm, (based on adap­tive squared Wasser­stein dis­tances) that is a k-​​means-​​like algo­rithm for clus­ter­ing a set of indi­vid­u­als into K classes that are apri­ori fixed. The main aim of this research is to pro­vide a tool for clus­ter­ing his­tograms, empha­siz­ing the dif­fer­ent con­tri­bu­tions of the his­togram vari­ables, and their com­po­nents, to the def­i­n­i­tion of the clus­ters. We demon­strate that this can be achieved using adap­tive dis­tances. Two kind of adap­tive dis­tances are con­sid­ered: the first takes into account the vari­abil­ity of each com­po­nent of each descrip­tor for the whole set of indi­vid­u­als; the sec­ond takes into account the vari­abil­ity of each com­po­nent of each descrip­tor in each clus­ter. We fur­nish inter­pre­ta­tive tools of the obtained par­ti­tion based on an exten­sion of the clas­si­cal mea­sures (indexes) to the use of adap­tive dis­tances in the clus­ter­ing cri­te­rion func­tion. Appli­ca­tions on syn­thetic and real-​​world data cor­rob­o­rate the pro­posed procedure.”

    clas­si­fi­ca­tion sta­tis­tics his­tograms met­rics clus­ter­ing
  • [1110.1412] Quan­ti­fy­ing loopy net­work architectures

    “Biol­ogy presents many exam­ples of pla­nar dis­tri­b­u­tion and struc­tural net­works hav­ing dense sets of closed loops. An arche­type of this form of net­work orga­ni­za­tion is the vas­cu­la­ture of dicotyle­do­nous leaves, which show­cases a hierarchically-​​nested archi­tec­ture con­tain­ing closed loops at many dif­fer­ent lev­els. Although a num­ber of meth­ods have been pro­posed to mea­sure aspects of the struc­ture of such net­works, a robust met­ric to quan­tify their hier­ar­chi­cal orga­ni­za­tion is still lack­ing. We present an algo­rith­mic frame­work, the hier­ar­chi­cal loop decom­po­si­tion, that allows map­ping loopy net­works to binary trees, pre­serv­ing in the con­nec­tiv­ity of the trees the archi­tec­ture of the orig­i­nal graph. We apply this frame­work to inves­ti­gate com­puter gen­er­ated graphs, such as arti­fi­cial mod­els and opti­mal dis­tri­b­u­tion net­works, as well as nat­ural graphs extracted from dig­i­tized images of dicotyle­do­nous leaves and vas­cu­la­ture of rat cere­bral neo­cor­tex. We cal­cu­late var­i­ous met­rics based on the Asym­me­try, the cumu­la­tive size dis­tri­b­u­tion and the Strahler bifur­ca­tion ratios of the cor­re­spond­ing trees and dis­cuss the rela­tion­ship of these quan­ti­ties to the archi­tec­tural orga­ni­za­tion of the orig­i­nal graphs. This algo­rith­mic frame­work decou­ples the geo­met­ric infor­ma­tion (exact loca­tion of edges and nodes) from the met­ric topol­ogy (con­nec­tiv­ity and edge weight) and it ulti­mately allows us to per­form a quan­ti­ta­tive sta­tis­ti­cal com­par­i­son between pre­dic­tions of the­o­ret­i­cal mod­els and nat­u­rally occur­ring loopy graphs.”

    com­plex­ol­ogy bio­physics network-​​theory met­rics
  • [1110.1393] High-​​Precision Tun­ing of State for Mem­ris­tive Devices by Adapt­able Variation-​​Tolerant Algorithm

    “Using mem­ris­tive prop­er­ties com­mon for the tita­nium diox­ide thin film devices, we designed a sim­ple write algo­rithm to tune device con­duc­tance at a spe­cific bias point to 1% rel­a­tive accu­racy (which is roughly equiv­a­lent to 7-​​bit pre­ci­sion) within its dynamic range even in the pres­ence of large vari­a­tions in switch­ing behav­ior. The high pre­ci­sion state is non­volatile and the results are likely to be sus­tained for nanoscale mem­ris­tive devices because of the inher­ent fil­a­men­tary nature of the resis­tive switch­ing. The pro­posed func­tion­al­ity of mem­ris­tive devices is espe­cially attrac­tive for ana­log com­put­ing with low pre­ci­sion data. As one rep­re­sen­ta­tive exam­ple we demon­strate hybrid cir­cuitry con­sist­ing of CMOS sum­ming ampli­fier and two mem­ris­tive devices to per­form ana­log mul­ti­ply and accu­mu­late com­pu­ta­tion, which is a typ­i­cal bot­tle­neck oper­a­tion in infor­ma­tion processing.”

    mem­ris­tors engineering-​​design sim­u­la­tion control-​​systems nudge-​​targets
  • [1110.1521] Nodal domains of a non-​​separable prob­lem — the right angled isosce­les triangle

    “Our result may be gen­er­al­ized to other domains where sim­i­lar algo­rithms may apply. Our algo­rithm is based on the fact that the eigen­func­tions are pre­sented as a lin­ear com­bi­na­tion of sim­ple plane waves. It is there­fore tempt­ing to try and gen­er­al­ize it for other drums with sim­i­lar prop­erty. The equi­lat­eral tri­an­gle is an imme­di­ate can­di­date (see [29] and ref­er­ences within). A fur­ther, and quite sur­pris­ing, result is the recur­sive for­mula for the num­ber of nodal loops. To our knowl­edge this is the first known exact for­mula for the nodal count of a non-​​separable pla­nar man­i­fold (for cer­tain eigen­func­tions of tori exact for­mu­las have been given in [22]). The for­mula was found by direct inspec­tion of large tables and has been ver­i­fied for a large bulk of data com­pu­ta­tion­ally. An obvi­ous chal­lenge is to prove this for­mula. In par­tic­u­lar, the recur­sive part of the for­mula resem­bles the famous Euclid algo­rithm for the great­est com­mon divi­sor. A fur­ther inves­ti­ga­tion of the men­tioned for­mula might there­fore expose some new num­ber the­o­ret­i­cal prop­er­ties of the nodal count.”

    physics algo­rithms analytical-​​results open-​​questions geom­e­try acoustics exact-​​form nudge-​​targets
  • [1110.1485] A Face Recog­ni­tion Scheme using Wavelet Based Dom­i­nant Features

    “In this paper, a multi-​​resolution fea­ture extrac­tion algo­rithm for face recog­ni­tion is pro­posed based on two-​​dimensional dis­crete wavelet trans­form (2D-​​DWT), which effi­ciently exploits the local spa­tial vari­a­tions in a face image. For the pur­pose of fea­ture extrac­tion, instead of con­sid­er­ing the entire face image, an entropy-​​based local band selec­tion cri­te­rion is devel­oped, which selects high-​​informative hor­i­zon­tal seg­ments from the face image. In order to cap­ture the local spa­tial vari­a­tions within these high­in­for­ma­tive hor­i­zon­tal bands pre­cisely, the hor­i­zon­tal band is seg­mented into sev­eral small spa­tial mod­ules. Dom­i­nant wavelet coef­fi­cients cor­re­spond­ing to each local region resid­ing inside those hor­i­zon­tal bands are selected as fea­tures. In the selec­tion of the dom­i­nant coef­fi­cients, a thresh­old cri­te­rion is pro­posed, which not only dras­ti­cally reduces the fea­ture dimen­sion but also pro­vides high within-​​class com­pact­ness and high between-​​class sep­a­ra­bil­ity. A prin­ci­pal com­po­nent analy­sis is per­formed to fur­ther reduce the dimen­sion­al­ity of the fea­ture space. Exten­sive exper­i­men­ta­tion is car­ried out upon stan­dard face data­bases and a very high degree of recog­ni­tion accu­racy is achieved by the pro­posed method in com­par­i­son to those obtained by some of the exist­ing methods.”

    face-​​recognition algo­rithms image-​​processing wavelets nudge-​​targets
  • [1110.1553] Hier­ar­chi­cal QR fac­tor­iza­tion algo­rithms for multi-​​core clus­ter systems

    “This paper describes a new QR fac­tor­iza­tion algo­rithm which is espe­cially designed for mas­sively par­al­lel plat­forms com­bin­ing par­al­lel dis­trib­uted multi-​​core nodes. These plat­forms make the present and the fore­see­able future of high-​​performance com­put­ing. Our new QR fac­tor­iza­tion algo­rithm falls in the cat­e­gory of the tile algo­rithms which nat­u­rally enables good data local­ity for the sequen­tial ker­nels exe­cuted by the cores (high sequen­tial per­for­mance), low num­ber of mes­sages in a par­al­lel dis­trib­uted set­ting (small latency term), and fine gran­u­lar­ity (high parallelism).”

    parallel-​​computing operations-​​research fac­tor­iza­tion algo­rithms nudge-​​targets meta-​​algorithms
  • [1110.1560] On the Col­or­ing of Grid Wire­less Sen­sor Net­works: the Vector-​​Based Col­or­ing Method

    “Graph col­or­ing is used in wire­less net­works to opti­mize net­work resources: band­width and energy. Nodes access the medium accord­ing to their color. It is the respon­si­bil­ity of the col­or­ing algo­rithm to ensure that inter­fer­ing nodes do not have the same color. In this research report, we focus on wire­less sen­sor net­works with grid topolo­gies. How does a col­or­ing algo­rithm take advan­tage of the reg­u­lar­ity of grid topol­ogy to pro­vide an opti­mal peri­odic col­or­ing, that is a col­or­ing with the min­i­mum num­ber of col­ors? We pro­pose the Vector-​​Based Col­or­ing Method, denoted VCM, a new method that is able to pro­vide an opti­mal peri­odic col­or­ing for any radio trans­mis­sion range and for any h-​​hop col­or­ing, h>=1. This method con­sists in deter­min­ing at which grid nodes a color can be repro­duced with­out cre­at­ing inter­fer­ences between these nodes while min­i­miz­ing the num­ber of col­ors used. We com­pare the num­ber of col­ors pro­vided by VCM with the num­ber of col­ors obtained by a dis­trib­uted col­or­ing algo­rithm with line and col­umn pri­or­ity assign­ments. We also pro­vide bounds on the num­ber of col­ors of opti­mal gen­eral col­or­ings of the infi­nite grid, and show that peri­odic col­or­ings (and thus VCM) are asymp­tot­i­cally opti­mal. Finally, we dis­cuss the applic­a­bil­ity of this method to a real wire­less network.”

    graph-​​theory algo­rithms operations-​​research nudge-​​targets
  • [1110.1580] A Polylogarithmic-​​Competitive Algo­rithm for the k-​​Server Problem

    “We give the first polylogarithmic-​​competitive ran­dom­ized online algo­rithm for the $k$-server prob­lem on an arbi­trary finite met­ric space. In par­tic­u­lar, our algo­rithm achieves a com­pet­i­tive ratio of O(log^3 n log^2 k log log n) for any met­ric space on n points. Our algo­rithm improves upon the deter­min­is­tic (2k-1)-competitive algo­rithm of Kout­sou­pias and Papadim­itriou [J.ACM’95] when­ever n is sub-​​exponential in k.”

    sched­ul­ing operations-​​research algo­rithms nudge-​​targets
  • [1110.1590] PSA: The Packet Sched­ul­ing Algo­rithm for Wire­less Sen­sor Networks

    “The main cause of wasted energy con­sump­tion in wire­less sen­sor net­works is packet col­li­sion. The packet sched­ul­ing algo­rithm is there­fore intro­duced to solve this prob­lem. Some packet sched­ul­ing algo­rithms can also influ­ence and delay the data trans­mit­ting in the real-​​time wire­less sen­sor net­works. This paper presents the packet sched­ul­ing algo­rithm (PSA) in order to reduce the packet con­ges­tion in MAC layer lead­ing to reduce the over­all of packet col­li­sion in the sys­tem The PSA is com­pared with the sim­ple CSMA/​CA and other approaches using net­work topol­ogy bench­marks in math­e­mat­i­cal method. The per­for­mances of our PSA are bet­ter than the stan­dard (CSMA/​CA). The PSA pro­duces bet­ter through­put than other algo­rithms. On other hand, the aver­age delay of PSA is higher than pre­vi­ous works. How­ever, the PSA uti­lizes the chan­nel bet­ter than all algorithms.”

    sensor-​​networks distributed-​​processing sched­ul­ing rout­ing operations-​​research algo­rithms nudge-​​targets
  • [1110.0725] A Sur­vey of Dis­trib­uted Data Aggre­ga­tion Algorithms

    “Dis­trib­uted data aggre­ga­tion has been an active field of research in the last decade, and a huge diverse amount of tech­niques can be found in the lit­er­a­ture. For this rea­sons, this sur­vey intends to be an impor­tant time sav­ing instru­ment, for those that desire to get a quick and com­pre­hen­sive overview of the state of the art on dis­trib­uted data aggre­ga­tion. More­over, by care­fully high­light­ing the strength and lim­i­ta­tions of the more per­ti­nent approaches, this study can pro­vide a use­ful assis­tance to help read­ers choose which tech­nique to apply in spe­cific set­tings. Cur­rently, there is no ideal gen­eral solu­tion to the dis­trib­uted com­pu­ta­tion of an aggre­ga­tion func­tion, all exist­ing tech­niques have its pit­falls (some more than oth­ers). There­fore, more research in this field will be expected in the next few years. In par­tic­u­lar, due to the added value of com­put­ing com­plex aggre­gates, new algo­rithms might arise to esti­mate the sta­tis­ti­cal dis­tri­b­u­tion of val­ues, as the few exist­ing approaches exhibit some lim­i­ta­tions in terms of accu­racy and resource con­sump­tion. Addi­tional research efforts should be made to improve the sup­port to churn, mes­sage loss, and con­tin­u­ous esti­ma­tion of muta­ble input values.”

    sta­tis­tics reviews distributed-​​processing com­mu­ni­ca­tion coor­di­na­tion nudge-​​targets
  • [0911.3482] Com­plex­ity of Net­works (reprise)

    “Net­work or graph struc­tures are ubiq­ui­tous in the study of com­plex sys­tems. Often, we are inter­ested in com­plex­ity trends of these sys­tem as it evolves under some dynamic. An exam­ple might be look­ing at the com­plex­ity of a food web as species enter an ecosys­tem via migra­tion or spe­ci­a­tion, and leave via extinc­tion. In a pre­vi­ous paper, a com­plex­ity mea­sure of net­works was pro­posed based on the {em com­plex­ity is infor­ma­tion con­tent} par­a­digm. To apply this par­a­digm to any object, one must fix two things: a rep­re­sen­ta­tion lan­guage, in which strings of sym­bols from some alpha­bet describe, or stand for the objects being con­sid­ered; and a means of deter­min­ing when two such descrip­tions refer to the same object. With these two things set, the infor­ma­tion con­tent of an object can be com­puted in prin­ci­ple from the num­ber of equiv­a­lent descrip­tions describ­ing a par­tic­u­lar object. The pre­vi­ously pro­posed rep­re­sen­ta­tion lan­guage had the defi­ciency that the fully con­nected and empty net­works were the most com­plex for a given num­ber of nodes. A vari­a­tion of this mea­sure, called zcom­plex­ity, applied a com­pres­sion algo­rithm to the result­ing bit­string rep­re­sen­ta­tion, to solve this prob­lem. Unfor­tu­nately, zcom­plex­ity proved too com­pu­ta­tion­ally expen­sive to be prac­ti­cal. In this paper, I pro­pose a new rep­re­sen­ta­tion lan­guage that encodes the num­ber of links along with the num­ber of nodes and a rep­re­sen­ta­tion of the lin­klist. This, like zcom­plex­ity, exhibits min­i­mal com­plex­ity for fully con­nected and empty net­works, but is as tractable as the orig­i­nal measure.”

    network-​​theory com­plex­ol­ogy complex-​​systems mea­sure­ment per­form structure-​​function-​​relations discrete-​​mathematics
  • [1108.4279] Detec­tion and emergence

    “Two dif­fer­ent con­cep­tions of emer­gence are rec­on­ciled as two instances of the phe­nom­e­non of detec­tion. In the process of com­par­ing these two con­cep­tions, we find that the notions of com­plex­ity and detec­tion allow us to form a uni­fied def­i­n­i­tion of emer­gence that clearly delin­eates the role of the observer.”

    com­plex­ol­ogy emer­gence pragmatism-it-ain’t but-​​soon
  • [1001.4278] Weight Opti­miza­tion for Dis­trib­uted Aver­age Con­sen­sus Algo­rithm in Sym­met­ric, CCS & KCS Star Networks

    “This paper addresses weight opti­miza­tion prob­lem in dis­trib­uted con­sen­sus aver­ag­ing algo­rithm over net­works with sym­met­ric star topol­ogy. We have deter­mined opti­mal weights and con­ver­gence rate of the net­work in terms of its topo­log­i­cal para­me­ters. In addi­tion, two alter­na­tive topolo­gies with more rapid con­ver­gence rates have been intro­duced. The new topolo­gies are Complete-​​Cored Sym­met­ric (CCS) star and K-​​Cored Sym­met­ric (KCS) star topolo­gies. It has been shown that the opti­mal weights for the edges of cen­tral part in sym­met­ric and CCS star con­fig­u­ra­tions are inde­pen­dent of their branches. By sim­u­la­tion opti­mal­ity of obtained weights under quan­ti­za­tion con­straints have been verified.”

    operations-​​research decision-​​making network-​​theory nudge-​​targets
  • [1109.5389] Water dri­ves pep­tide con­for­ma­tional transitions

    “Tran­si­tions between metastable con­for­ma­tions of a dipep­tide are inves­ti­gated using clas­si­cal mol­e­c­u­lar dynam­ics sim­u­la­tion with explicit water mol­e­cules. The dis­tri­b­u­tion of the sur­round­ing water at dif­fer­ent moments before the tran­si­tions and the dynam­i­cal cor­re­la­tions of water with the peptide’s con­fig­u­ra­tional motions indi­cate that water is the main dri­ving force of the con­for­ma­tional changes.”

    molecular-​​design systems-​​biology sim­u­la­tion intracellular-​​dynamics kinda-​​knew-​​this-​​a-​​long-​​time-​​ago bio­chem­istry
  • [1105.1445] Vehic­u­lar traf­fic flow at an inter­sec­tion with the pos­si­bil­ity of turning

    “We have devel­oped a Nagel-​​Schreckenberg cel­lu­lar automata model for describ­ing of vehic­u­lar traf­fic flow at a sin­gle inter­sec­tion. A set of traf­fic lights oper­at­ing in fixed-​​time scheme con­trols the traf­fic flow. Open bound­ary con­di­tion is applied to the streets each of which con­duct a uni-​​directional flow. Streets are single-​​lane and cars can turn upon reach­ing to the inter­sec­tion with pre­scribed prob­a­bil­i­ties. Exten­sive Monte Carlo sim­u­la­tions are car­ried out to find the model flow char­ac­ter­is­tics. In par­tic­u­lar, we inves­ti­gate the flows depen­dence on the sig­nal­i­sa­tion para­me­ters, turn­ing prob­a­bil­i­ties and input rates. It is shown that for each set of para­me­ters, there exist a plateau region inside which the total out­flow from the inter­sec­tion remains almost con­stant. We also com­pute total wait­ing time of vehi­cles per cycle behind red lights for var­i­ous con­trol parameters.”

    cellular-​​automata com­plex­ol­ogy traffic-​​models agent-​​based sim­u­la­tion nudge-​​substrates
  • [1110.1391] A Com­par­i­son of Dif­fer­ent Machine Translit­er­a­tion Models

    “Machine translit­er­a­tion is a method for auto­mat­i­cally con­vert­ing words in one lan­guage into pho­net­i­cally equiv­a­lent ones in another lan­guage. Machine translit­er­a­tion plays an impor­tant role in nat­ural lan­guage appli­ca­tions such as infor­ma­tion retrieval and machine trans­la­tion, espe­cially for han­dling proper nouns and tech­ni­cal terms. Four machine translit­er­a­tion mod­els — grapheme-​​based translit­er­a­tion model, phoneme-​​based translit­er­a­tion model, hybrid translit­er­a­tion model, and correspondence-​​based translit­er­a­tion model — have been pro­posed by sev­eral researchers. To date, how­ever, there has been lit­tle research on a frame­work in which mul­ti­ple translit­er­a­tion mod­els can oper­ate simul­ta­ne­ously. Fur­ther­more, there has been no com­par­i­son of the four mod­els within the same frame­work and using the same data. We addressed these prob­lems by 1) mod­el­ing the four mod­els within the same frame­work, 2) com­par­ing them under the same con­di­tions, and 3) devel­op­ing a way to improve machine translit­er­a­tion through this com­par­i­son. Our com­par­i­son showed that the hybrid and correspondence-​​based mod­els were the most effec­tive and that the four mod­els can be used in a com­ple­men­tary man­ner to improve machine translit­er­a­tion performance.”

    natural-​​language-​​processing machine-​​learning review nudge-​​targets
  • [1108.5508] A Pat­tern Measure

    “In this paper we pro­pose numer­i­cal mea­sures for eval­u­at­ing the aes­thetic inter­est of sim­ple pat­terns. The pat­terns con­sist of ele­ments (sym­bols, pix­els, etc.) in reg­u­lar square arrays. The mea­sures depend on two char­ac­ter­is­tics of the pat­terns: the num­ber of dif­fer­ent types of ele­ment, and the num­ber of sym­me­tries in their arrange­ment. We define two com­ple­men­tary com­pos­ite mea­sures L and C for the degree of pat­tern in a design, and com­pute them here for 2×2 and 6×6 arrays. The results dis­tin­guish sim­ple from high-​​variation cases. We sus­pect that the mea­sure L cor­re­sponds to the degree that human beings intu­itively feel a design to be “inter­est­ing”, so this model would aid in quan­ti­fy­ing the visual con­nec­tion of two– dimen­sional designs with view­ers. The other com­pos­ite mea­sure C based on these numer­i­cal prop­er­ties char­ac­ter­izes the extent of ran­dom­ness of an array. Com­bin­ing sym­bol vari­ety with sym­me­try cal­cu­la­tions allows us to employ hier­ar­chi­cal scal­ing to count the rel­a­tive impact of dif­fer­ent lev­els of scale. By iden­ti­fy­ing sub­struc­tures we can dis­tin­guish between orga­nized pat­terns and dis­or­ga­nized com­plex­ity. The mea­sures described here are related to ver­bal descrip­tors derived from work by psy­chol­o­gists on responses to visual environments.”

    cog­ni­tion aes­thet­ics experimental-​​psychology nudge-​​targets learning-​​by-​​watching
  • [1106.5264] Acquir­ing Cor­rect Knowl­edge for Nat­ural Lan­guage Generation

    “Nat­ural lan­guage gen­er­a­tion (NLG) sys­tems are com­puter soft­ware sys­tems that pro­duce texts in Eng­lish and other human lan­guages, often from non-​​linguistic input data. NLG sys­tems, like most AI sys­tems, need sub­stan­tial amounts of knowl­edge. How­ever, our expe­ri­ence in two NLG projects sug­gests that it is dif­fi­cult to acquire cor­rect knowl­edge for NLG sys­tems; indeed, every knowl­edge acqui­si­tion (KA) tech­nique we tried had sig­nif­i­cant prob­lems. In gen­eral terms, these prob­lems were due to the com­plex­ity, nov­elty, and poorly under­stood nature of the tasks our sys­tems attempted, and were wors­ened by the fact that peo­ple write so dif­fer­ently. This meant in par­tic­u­lar that corpus-​​based KA approaches suf­fered because it was impos­si­ble to assem­ble a siz­able cor­pus of high-​​quality con­sis­tent man­u­ally writ­ten texts in our domains; and struc­tured expert-​​oriented KA tech­niques suf­fered because experts dis­agreed and because we could not get enough infor­ma­tion about spe­cial and unusual cases to build robust sys­tems. We believe that such prob­lems are likely to affect many other NLG sys­tems as well. In the long term, we hope that new KA tech­niques may emerge to help NLG sys­tem builders. In the shorter term, we believe that under­stand­ing how indi­vid­ual KA tech­niques can fail, and using a mix­ture of dif­fer­ent KA tech­niques with dif­fer­ent strengths and weak­nesses, can help devel­op­ers acquire NLG knowl­edge that is mostly correct.”

    natural-​​language-​​processing artificial-​​intelligence interesting-​​problems high-​​hanging-​​fruit machine-​​learning nudge-​​targets
  • [1105.2423] Cytoskele­ton and Cell Motility

    “The present arti­cle is an invited con­tri­bu­tion to the Ency­clo­pe­dia of Com­plex­ity and Sys­tem Sci­ence, Robert A. Mey­ers Ed., Springer New York (2009). It is a review of the bio­phys­i­cal mech­a­nisms that underly cell motility.…”

    bio­physics biol­ogy review i-​​used-​​to-​​do-​​this-​​stuff lovely
  • [1110.0671] Width Dis­tri­b­u­tions for Con­vex Reg­u­lar Polyhedra

    “The mean width is a mea­sure on three-​​dimensional con­vex bod­ies that enjoys equal sta­tus with vol­ume and sur­face area [Rota]. As the phrase sug­gests, it is the mean of a prob­a­bil­ity den­sity f. We ver­ify for­mu­las for mean widths of the reg­u­lar tetra­he­dron and the cube. Higher-​​order moments of f_​tetra and f_​cube have not been exam­ined until now. Assume that each poly­he­dron has edges of unit length. We deduce that the mean square width of the reg­u­lar tetra­he­dron is 1/3+(3+sqrt(3))/(3*pi) and the mean square width of the cube is 1+4/pi.”

    geom­e­try mathematical-​​recreations nudge-​​targets
  • [cs/​0305036] Using Dynamic Sim­u­la­tion in the Devel­op­ment of Con­struc­tion Machinery

    “As in the car indus­try for quite some time, dynamic sim­u­la­tion of com­plete vehi­cles is being prac­ticed more and more in the devel­op­ment of off-​​road machin­ery. How­ever, spe­cific ques­tions arise due not only to com­pany struc­ture and size, but espe­cially to the type of prod­uct. Tightly cou­pled, non-​​linear sub­sys­tems of dif­fer­ent domains make pre­dic­tion and opti­mi­sa­tion of the com­plete system’s dynamic behav­iour a chal­lenge. Fur­ther­more, the demand for ver­sa­tile machines leads to some­times con­tra­dic­tory tar­get require­ments and can turn the design process into a hunt for the least painful com­pro­mise. This can be avoided by pro­found sys­tem knowl­edge, assisted by simulation-​​driven prod­uct devel­op­ment. This paper gives an overview of joint research into this issue by Volvo Wheel Load­ers and Linkop­ing Uni­ver­sity on that mat­ter, lists the results of a related lit­er­a­ture review and intro­duces the term “oper­ate­abil­ity”. Rather than giv­ing detailed answers, the prob­lem space for ongo­ing and future research is exam­ined and pos­si­ble solu­tions are sketched.”

    engineering-​​design design-​​automation mod­el­ing dynamical-​​systems man­u­fac­tur­ing nudge-​​targets