Items of some interest…

These are my recent Pinboard.in links:

  • [1101.4003] Dyna-H: a heuristic planning reinforcement learning algorithm applied to role-playing-game strategy decision systems

    “In a Role-Playing Game, finding optimal trajectories is one of the most important tasks. In fact, the strategy decision system becomes a key component of a game engine. Determining the way in which decisions are taken (online, batch or simulated) and the consumed resources in decision making (e.g. execution time, memory) will influence, in major degree, the game performance. When classical search algorithms such as A* can be used, they are the very first option. Nevertheless, such methods rely on precise and complete models of the search space, and there are many interesting scenarios where their application is not possible. Then, model free methods for sequential decision making under uncertainty are the best choice. In this paper, we propose a heuristic planning strategy to incorporate the ability of heuristic-search in path-finding into a Dyna agent. The proposed Dyna-H algorithm, as A* does, selects branches more likely to produce outcomes than other branches. Besides, it has the advantages of being a model-free online reinforcement learning algorithm. The proposal was evaluated against the one-step Q-Learning and Dyna-Q algorithms obtaining excellent experimental results: Dyna-H significantly overcomes both methods in all experiments. We suggest also, a functional analogy between the proposed sampling from worst trajectories heuristic and the role of dreams (e.g. nightmares) in human behavior.”

    planning machine-learning nudge-targets easy-pickins
  • [0908.3565] Distributed Location Optimization for Sensors with Limited Range Heterogeneous Capabilities using Generalized Voronoi Partition

    “In this paper a generalization of the Voronoi partition is used for solving a heterogeneous distributed locational optimization problem for autonomous agents, such as AGVs, UAVs, etc. The problem addressed is of optimal deployment of agents equipped with sensors, having heterogeneous capabilities, and limited range, to maximize sensor coverage. An objective function for optimal deployment of agents is formulated, and its critical points are determined. The optimal deployment is shown to be the generalized centroidal Voronoi configuration in which the agents are located at the centroids of the corresponding generalized Voronoi cells. Formal results on stability, convergence, and on spatial distribution of the proposed control laws responsible for agent motion, under some constraints on the agents’ speeds and limit on sensor range are provided. The theoretical results are supported with illustrative simulation”

    agent-based coordination sensor-networks nudge-targets emergent-design
  • [1106.6058] Stability of strategies in payoff-driven evolutionary games on networks

    “We consider a network of coupled agents playing the Prisoner’s Dilemma game, in which players are allowed to pick a strategy in the interval [0,1], with 0 corresponding to defection, 1 to cooperation, and intermediate values representing mixed strategies in which each player may act as a cooperator or a defector over a large number of interactions with a certain probability. Our model is payoff-driven, i.e., we assume that the level of accumulated payoff at each node is a relevant parameter in the selection of strategies. Also, we consider that each player chooses his/her strategy in a context of limited information. We present a deterministic nonlinear model for the evolution of strategies. We show that the final strategies depend on the network structure and on the choice of the parameters of the game. We find that polarized strategies (pure cooperator/defector states) typically emerge when (i) the network connections are sparse, (ii) the network degree distribution is heterogeneous, (iii) the network is assortative, and surprisingly, (iv) the benefit of cooperation is high.”

    prisoners’-dilemma agent-based network-theory artificial-life complexology nudge-targets
  • [1106.0296] The Emergence of Leadership in Social Networks

    “We study a networked version of the minority game in which agents can choose to follow the choices made by a neighbouring agent in a social network. We show that for a wide variety of networks a leadership structure always emerges, with most agents following the choice made by a few agents. We find a suitable parameterisation which highlights the universal aspects of the behaviour and which also indicates where results depend on the type of social network.”

    minority-game social-networks sociology agent-based network-theory
  • [1106.1816] Monitoring Teams by Overhearing: A Multi-Agent Plan-Recognition Approach

    “Recent years are seeing an increasing need for on-line monitoring of teams of cooperating agents, e.g., for visualization, or performance tracking. However, in monitoring deployed teams, we often cannot rely on the agents to always communicate their state to the monitoring system. This paper presents a non-intrusive approach to monitoring by ‘overhearing’, where the monitored team’s state is inferred (via plan-recognition) from team-members’ routine communications, exchanged as part of their coordinated task execution, and observed (overheard) by the monitoring system. Key challenges in this approach include the demanding run-time requirements of monitoring, the scarceness of observations (increasing monitoring uncertainty), and the need to scale-up monitoring to address potentially large teams. To address these, we present a set of complementary novel techniques, exploiting knowledge of the social structures and procedures in the monitored team: (i) an efficient probabilistic plan-recognition algorithm, well-suited for processing communications as observations; (ii) an approach to exploiting knowledge of the team’s social behavior to predict future observations during execution (reducing monitoring uncertainty); and (iii) monitoring algorithms that trade expressivity for scalability, representing only certain useful monitoring hypotheses, but allowing for any number of agents and their different activities to be represented in a single coherent entity. We present an empirical evaluation of these techniques, in combination and apart, in monitoring a deployed team of agents, running on machines physically distributed across the country, and engaged in complex, dynamic task execution. We also compare the performance of these techniques to human expert and novice monitors, and show that the techniques presented are capable of monitoring at human-expert levels, despite the difficulty of the task.”

    emergent-design agent-based swarms coordination nudge
  • [1011.2861] A Comprehensive Workflow for General-Purpose Neural Modeling with Highly Configurable Neuromorphic Hardware Systems

    “In this paper we present a methodological framework that meets novel requirements emerging from upcoming types of accelerated and highly configurable neuromorphic hardware systems. We describe in detail a device with 45 million programmable and dynamic synapses that is currently under development, and we sketch the conceptual challenges that arise from taking this platform into operation. More specifically, we aim at the establishment of this neuromorphic system as a flexible and neuroscientifically valuable modeling tool that can be used by non-hardware-experts. We consider various functional aspects to be crucial for this purpose, and we introduce a consistent workflow with detailed descriptions of all involved modules that implement the suggested steps: The integration of the hardware interface into the simulator-independent model description language PyNN; a fully automated translation between the PyNN domain and appropriate hardware configurations; an executable specification of the future neuromorphic system that can be seamlessly integrated into this biology-to-hardware mapping process as a test bench for all software layers and possible hardware design modifications; an evaluation scheme that deploys models from a dedicated benchmark library, compares the results generated by virtual or prototype hardware devices with reference software simulations and analyzes the differences. The integration of these components into one hardware-software workflow provides an ecosystem for ongoing preparative studies that support the hardware design process and represents the basis for the maturity of the model-to-hardware mapping software. The functionality and flexibility of the latter is proven with a variety of experimental results.”

    neural-networks biologically-inspired electronics emergent-design nudge-targets
