These are my recent Pinboard.in links:
Publishing Has Perished: Long Live the Personal Cloud | Cloudline | Wired.com
“Even that arrangement wouldn’t be ideal, though. When a publisher abandons an article of mine, it also abandons links to it from any page that referred to the article. In the scientific and academic realms, digital object identifiers (DOIs) are used to mint namespaces that can transcend the lifetimes (or attention spans) of individual publishers. That technology hasn’t yet trickled down to the rest of us, but I’d love to have my personal cloud implement such a scheme and be the resolver of last resort for my published work.”
via:vielmetti publishing archives extended-mind memory make-it-so- “In this paper, we study optimal radar deployment for intrusion detection, with focus on network coverage. In contrast to the disk-based sensing model in a traditional sensor network, the detection range of a bistatic radar depends on the locations of both the radar transmitter and radar receiver, and is characterized by Cassini ovals. Furthermore, in a network with multiple radar transmitters and receivers, since any pair of transmitter and receiver can potentially form a bistatic radar, the detection ranges of different bistatic radars are coupled and the corresponding network coverage is intimately related to the locations of all transmitters and receivers, making the optimal deployment design highly non-trivial. Clearly, the detectability of an intruder depends on the highest SNR received by all possible bistatic radars. We focus on the worst-case intrusion detectability, i.e., the minimum possible detectability along all possible intrusion paths. Although it is plausible to deploy radars on a shortest line segment across the field, it is not always optimal in general, which we illustrate via counter-examples. We then present a sufficient condition on the field geometry for the optimality of shortest line deployment to hold. Further, we quantify the local structure of detectability corresponding to a given deployment order and spacings of radar transmitters and receivers, building on which we characterize the optimal deployment to maximize the worst-case intrusion detectability. Our results show that the optimal deployment locations exhibit a balanced structure. We also develop a polynomial-time approximation algorithm for characterizing the worse-case intrusion path for any given locations of radars under random deployment.”
optimization simulation nudge-targets coevolution minimax-problems - “Cooperative vehicle safety (CVS) systems operate based on broadcast of vehicle position and safety information to neighboring cars. The communication medium of CVS is a vehicular ad-hoc network. One of the main challenges in large scale deployment of CVS systems is the issue of scalability. To address the scalability problem, several congestion control methods have been proposed and are currently under field study. These algorithms adapt transmission rate and power based on network measures such as channel busy ratio. We examine two such algorithms and study their dynamic behavior in time and space to evaluate stability (in time) and fairness (in space) properties of these algorithms. We present stability conditions and evaluate stability and fairness of the algorithms through simulation experiments. Results show that there is a trade-off between fast convergence, temporal stability and spatial fairness. The proper ranges of parameters for achieving stability are presented for the discussed algorithms. Stability is verified for all typical road density cases. Fairness is shown to be naturally achieved for some algorithms, while under the same conditions other algorithms may suffer from unfairness issues. A method for resolving unfairness is introduced and evaluated through simulations.”
robotics complexology emergent-design traffic-models collective-behavior performance-measure nudge-targets simulation [1205.6147] A curvature-driven effective attraction in multicomponent membranes
“We study closed liquid membranes that segregate into three phases due to differences in the chemical and physical properties of its components. The shape and in-plane membrane arrangement of the phases are coupled through phase-specific bending energies and line tensions. We use simulated annealing Monte Carlo simulations to find low-energy structures, allowing both phase arrangement and membrane shape to relax. The three-phase system is the simplest one in which there are multiple interface pairs, allowing us to analyze interfacial preferences and pairwise distinct line tensions. We observe the system’s preference for interface pairs that maximize differences in spontaneous curvature. From a pattern selection perspective, this acts as an effective attraction between phases of most disparate spontaneous curvature. We show that this effective attraction is robust enough to persist even when the interface between these phases is the most penalized by line tension. This effect is driven by geometry and not by any explicit component-component interaction.”
simulation membrane-physics classification phase-diagrams nudge-targets[1205.2170] Collaborative Search on the Plane without Communication
“We use distributed computing tools to provide a new perspective on the behavior of cooperative biological ensembles. We introduce the Ants Nearby Treasure Search (ANTS) problem, a generalization of the classical cow-path problem, which is relevant for collective foraging in animal groups. In the ANTS problem, k identical (probabilistic) agents, initially placed at some central location, collectively search for a treasure in the two-dimensional plane. The treasure is placed at a target location by an adversary and the goal is to find it as fast as possible as a function of both k and D, where D is the distance between the central location and the target.…”
low-hanging-fruit nudge-targets agent-based autonomous-agents[1206.1571] Steady-state fluctuations of a genetic feedback loop: an exact solution
“…For the case where the degradation rate of bound and free protein is the same, our solution is at variance with a previous claim of an exact solution (Hornos et al, Phys. Rev. E {bf 72}, 051907 (2005) and subsequent studies). We show explicitly that this is due to an unphysical formulation of the underlying master equation in those studies.”
unphysical biochemistry models analytical-models-of-messy-old-life- Something interesting but very difficult in here about representation theory for metaheuristics, and practical (contextual) landscape reconfiguration. “In this paper we consider the problem of locating a nonzero entry in a high-dimensional vector from possibly adaptive linear measurements. We consider a recursive bisection method which we dub the compressive binary search and show that it improves on what any nonadaptive method can achieve. We also establish a non-asymptotic lower bound that applies to all methods, regardless of their computational complexity. Combined, these results show that the compressive binary search is within a double logarithmic factor of the optimal performance.”
needle-in-a-haystack algorithms computational-complexity [1107.0500] Factorization of Matrices of Quaternions
“We review known factorization results in quaternion matrices. Specifically, we derive the Jordan canonical form, polar decomposition, singular value decomposition, the QR factorization. We prove there is a Schur factorization for commuting matrices, and from this derive the spectral theorem. We do not consider algorithms, but do point to some of the numerical literature. Rather than work directly with matrices of quaternions, we work with complex matrices with a specific symmetry based on the dual operation. We discuss related results regarding complex matrices that are self-dual or symmetric, but perhaps not Hermitian.”
quantums algorithms matrices open-questions nudge-targets amusing[1204.0163] Coordination and Emergence in the Cellular Automated Fashion Game
“Fashion plays such a crucial rule in the evolution of culture and society that it is regarded as a second nature to the human being. Also, its impact on economy is quite nontrivial. On what is fashionable, interestingly, there are two viewpoints that are both extremely widespread but almost opposite: conformists think that what is popular is fashionable, while rebels believe that being different is the essence. Fashion color is fashionable in the first sense, and Lady Gaga in the second. We investigate a model where the population consists of the afore-mentioned two groups of people that are located on a spatial structure. Theoretically, this model is equivalent to the matching pennies game on the corresponding network, and has its own interest to game theory: it is a hybrid model of pure competition and pure cooperation. This is true because when a conformist meets a rebel, they play the zero sum matching pennies game, which is pure competition. When two conformists (rebels) meet, they play the (anti-) coordination game, which is pure cooperation. Simulation shows that in most cases people can reach an extraordinarily high degree of cooperation, through selfish, myopic, naive, and local interactions. Phase transition, as well as emergence of many interesting patterns, is also observed.”
cellular-automata agent-based social-dynamics complexology- “We extend the formalism of Random Boolean Networks with canalizing rules to multilevel complex networks. The formalism allows to model genetic networks in which each gene might take part in more than one signaling pathway. We use a semi-annealed approach to study the stability of this class of models when coupled in a multiplex network and show that the analytical results are in good agreement with numerical simulations. Our main finding is that the multiplex structure provides a mechanism for the stabilization of the system and of chaotic regimes of individual layers. Our results help understanding why some genetic networks that are theoretically expected to operate in the chaotic regime can actually display dynamical stability.”
boolean-networks complexology kauffmania interesting [1109.1488] Are Opinions Based on Science: Modelling Social Response to Scientific Facts
Oh how I laughed and laughed. “As scientists we like to think that modern societies and their members base their views, opinions and behaviour on scientific facts.…”
science! cultural-assumptions agent-based simulation social-networks- I hadn’t heard of matrix sketching before; warrants a look someday.
algorithms operations-research data-cleaning summarization-algorithms nudge-targets [1205.0537] A greedy-navigator approach to navigable city plans
“We use a set of four theoretical navigability indices for street maps to investigate the shape of the resulting street networks, if they are grown by optimizing these indices. The indices compare the performance of simulated navigators (having a partial information about the surroundings, like humans in many real situations) to the performance of optimally navigating individuals. We show that our simple greedy shortcut construction strategy generates the emerging structures that are different from real road network, but not inconceivable. The resulting city plans, for all navigation indices, share common qualitative properties such as the tendency for triangular blocks to appear, while the more quantitative features, such as degree distributions and clustering, are characteristically different depending on the type of metrics and routing strategies. We show that it is the type of metrics used which determines the overall shapes characterized by structural heterogeneity, but the routing schemes contribute to more subtle details of locality, which is more emphasized in case of unrestricted connections when the edge crossing is allowed.”
city-planning emergence emergent-design agent-based nudge-targets- “…Then, we propose a new and effective algorithm to generate parity inequalities derived from certain additional redundant parity check (RPC) constraints that can eliminate pseudocodewords produced by the LP decoder, often significantly improving the decoder error-rate performance. The cut-generating algorithm is based upon a specific transformation of an initial parity-check matrix of the linear block code. We also design two variations of the proposed decoder to make it more efficient when it is combined with the new cut-generating algorithm. Simulation results for several low-density parity-check (LDPC) codes demonstrate that the proposed decoding algorithms significantly narrow the performance gap between LP decoding and ML decoding.”
linear-programming information-theory algorithms nudge-targets searching-under-the-streetlight - “Deep shotgun sequencing and analysis of genomes, transcriptomes, amplified single-cell genomes, and metagenomes has enabled investigation of a wide range of organisms and ecosystems. However, sampling variation in short-read data sets and high sequencing error rates of modern sequencers present many new computational challenges in data interpretation. These challenges have led to the development of new classes of mapping tools and {em de novo} assemblers. These algorithms are challenged by the continued improvement in sequencing throughput. We here describe digital normalization, a single-pass computational algorithm that systematizes coverage in shotgun sequencing data sets, thereby decreasing sampling variation, discarding redundant data, and removing the majority of errors. Digital normalization substantially reduces the size of shotgun data sets and decreases the memory and time requirements for {em de novo} sequence assembly, all without significantly impacting content of the generated contigs. We apply digital normalization to the assembly of microbial genomic data, amplified single-cell genomic data, and transcriptomic data. Our implementation is freely available for use and modification.”
genomics bioinformatics algorithms statistics data-cleaning - “In this paper B-Rank, an efficient ranking algorithm for recommender systems, is proposed. B-Rank is based on a random walk model on hypergraphs. Depending on the setup, B-Rank outperforms other state of the art algorithms in terms of precision, recall (19% — 50%), and inter list diversity (20% — 60%). B-Rank captures well the difference between popular and niche objects. The proposed algorithm produces very promising results for sparse and dense voting matrices. Furthermore, a recommendation list update algorithm is introduced,to cope with new votes. This technique significantly reduces computational complexity. The implementation of the algorithm is simple, since B-Rank needs no parameter tuning.”
algorithms peer-production benchmarking amusing [1205.2777] Modelling slowly changing dynamic gene-regulatory networks
“Dynamic gene-regulatory networks are complex since the number of potential components involved in the system is very large. Estimating dynamic networks is an important task because they compromise valuable information about interactions among genes. Graphical models are a powerful class of models to estimate conditional independence among random variables, e.g. interactions in dynamic systems. Indeed, these interactions tend to vary over time. However, the literature has been focused on static networks, which can only reveal overall structures. Time-course experiments are performed in order to tease out significant changes in networks. It is typically reasonable to assume that changes in genomic networks are few because systems in biology tend to be stable. We introduce a new model for estimating slowly changes in dynamic gene-regulatory networks which is suitable for a high-dimensional dataset, e.g. time-course genomic data. Our method is based on i) the penalized likelihood with $ell_1$-norm, ii) the penalized differences between conditional independence elements across time points and iii) the heuristic search strategy to find optimal smoothing parameters. We implement a set of linear constraints necessary to estimate sparse graphs and penalized changing in dynamic networks. These constraints are not in the linear form. For this reason, we introduce slack variables to re-write our problem into a standard convex optimization problem subject to equality linear constraints. We show that GL$_Delta$ performs well in a simulation study. Finally, we apply the proposed model to a time-course genetic dataset T-cell.”
gene-regulatory-networks modeling systems-biology operations-research linear-programming nudge-targets[1205.3532] New Algorithms on Rooted Triplet Consistency
“An evolutionary tree (phylogenetic tree) is a binary, rooted, unordered tree that models the evolutionary history of currently living species in which leaves are labeled by species. In this paper, we investigate the problem of finding the maximum consensus evolutionary tree from a set of given rooted triplets. A rooted triplet is a phylogenetic tree on three leaves and shows the evolutionary relationship of the corresponding three species. The mentioned problem is known to be APX-hard. We present two new heuristic algorithms. For a given set of m triplets on n species, the FastTree algorithm runs in O(mn^2) which is faster than any other previously known algorithms, although, the outcome is less satisfactory. The BPMTR algorithm runs in O(mn^3) and in average performs better than any other previously known approximation algorithms for this problem.”
cladistics algorithms open-questions nudge-targets[1205.3720] A k-shell decomposition method for weighted networks
“One major limitation of most centrality measures, including the k-core decomposition method, is their design to work on unweighted graphs. However, in practice, real networks are weighted, and their weights describe important and well defined properties of the underlying systems. In order to overcome this limitation two main approaches were followed, but, both having drawbacks of their own. Under the first approach one would completely neglect the weights and perform the analysis on the unweighted network, but then one chooses to neglect an important property of the network. The second approach would be to consider only links with weights above some — (usually) arbitrary chosen — threshold value. However, this approach has a drawback since it has to deal with the selection of a proper cut-off value, and as we will discuss later, this could have significant impact on the results. Additionally, by neglecting links bellow a threshold, the network becomes sparser with some nodes getting disconnected and not considered by the applied method afterwards.”
network-theory social-networks algorithms graph-theory[1109.2341] Guaranteed successful strategies for a square achievement game on an n by n grid
“At some places (see the references) Martin Erickson describes a certain game: “Two players alternately write O’s (first player) and X’s (second player) in the unoccupied cells of an n x n grid. The first player (if any) to occupy four cells at the vertices of a square with horizontal and vertical sides is the winner.” Then he asks “What is the outcome of the game given optimal play?” or “What is the smallest n such that the first player has a winning strategy?” ”
nudge-targets game-theory mathematical-recreations[1206.0217] Efficient techniques for mining spatial databases
“Clustering is one of the major tasks in data mining. In the last few years, Clustering of spatial data has received a lot of research attention. Spatial databases are components of many advanced information systems like geographic information systems VLSI design systems. In this thesis, we introduce several efficient algorithms for clustering spatial data. First, we present a grid-based clustering algorithm that has several advantages and comparable performance to the well known efficient clustering algorithm. The algorithm has several advantages. The algorithm does not require many input parameters. It requires only three parameters, the number of the points in the data space, the number of the cells in the grid and a percentage. The number of the cells in the grid reflects the accuracy that should be achieved by the algorithm. The algorithm is capable of discovering clusters of arbitrary shapes. The computational complexity of the algorithm is comparable to the complexity of the most efficient clustering algorithm. The algorithm has been implemented and tested against different ranges of database sizes. The performance results show that the running time of the algorithm is superior to the most well known algorithms (CLARANS [23]). The results show also that the performance of the algorithm do not degrade as the number of the data points increases.”
GIS statistics clustering context-sensitive-data nudge-targets data-mining- “Lexical substitutes have found use in the context of word sense disambiguation, unsupervised part-of-speech induction, paraphrasing, machine translation, and text simplification. Using a statistical language model to find the most likely substitutes in a given context is a successful approach, but the cost of a naive algorithm is proportional to the vocabulary size. This paper presents the Fastsubs algorithm which can efficiently and correctly identify the most likely lexical substitutes for a given context based on a statistical language model without going through most of the vocabulary. The efficiency of Fastsubs makes large scale experiments based on lexical substitutes feasible. For example, it is possible to compute the top 10 substitutes for each one of the 1,173,766 tokens in Penn Treebank in about 6 hours on a typical workstation. The same task would take about 6 days with the naive algorithm. An implementation of the algorithm and a dataset with the top 100 substitutes of each token in the WSJ section of the Penn Treebank are available from the author’s website at this http URL”
linguistics data-cleaning algorithms nudge-targets classification [1204.1002] Fast Multi-Scale Detection of Relevant Communities
“Nowadays, networks are almost ubiquitous. In the past decade, community detection received an increasing interest as a way to uncover the structure of networks by grouping nodes into communities more densely connected internally than externally. Yet most of the effective methods available do not consider the potential levels of organisation, or scales, a network may encompass and are therefore limited. In this paper we present a method compatible with global and local criteria that enables fast multi-scale community detection. The method is derived in two algorithms, one for each type of criterion, and implemented with 6 known criteria. Uncovering communities at various scales is a computationally expensive task. Therefore this work puts a strong emphasis on the reduction of computational complexity. Some heuristics are introduced for speed-up purposes. Experiments demonstrate the efficiency and accuracy of our method with respect to each algorithm and criterion by testing them against large generated multi-scale networks. This study also offers a comparison between criteria and between the global and local approaches.”
social-networks network-theory algorithms community-detection- Now this one is cool.
statistics inverse-problems biochemistry signal-processing algorithms machine-learning nudge-targets inference-of-things-that-aren’t-toys - “…What exactly are we talking about when we say ‘using data’? Steven Johnson wrote an interesting piece in Wired two years ago using New York City 311 call data. Those city subway/train/bus route apps on your smartphone? Possible thanks to city governments opening up their data. The recently released Kauffman Foundation health care report contains plenty of discussion and ideas for both the public and private sector.”
social-entrepreneurship public-policy startups business-development-sortof shadow-economies open-science [1205.4422] 3D-Algorithms of Composed Pursuit Navigation
“The problem of pursuing a moving target is always one of the main topics in navigation. In the literatures, there are two well-known algorithms called Pure Pursuit and Pure Rendezvous navigation in the 3-dimensional space $mathbb{R}^3$. In this paper, these two methods are combined to introduce a novel family of pursuing algorithms called Composed Pursuit Navigation. The Kinematic and geometric properties of this navigation is studied. The trajectories of this new family of algorithms benefit the advantages of two known methods and its prominence is demonstrated in two real examples. Moreover, it is shown that the metric related to the algorithms are given by Matsumoto metrics.”
operations-research algorithms nudge-targets[1205.5975] A Domain-Specific Compiler for Linear Algebra Operations
“We present a prototypical linear algebra compiler that automatically exploits domain-specific knowledge to generate high-performance algorithms. The input to the compiler is a target equation together with knowledge of both the structure of the problem and the properties of the operands. The output is a variety of high-performance algorithms, and the corresponding source code, to solve the target equation. Our approach consists in the decomposition of the input equation into a sequence of library-supported kernels. Since in general such a decomposition is not unique, our compiler returns not one but a number of algorithms. The potential of the compiler is shown by means of its application to a challenging equation arising within the genome-wide association study. As a result, the compiler produces multiple “best” algorithms that outperform the best existing libraries.”
domain-specific-language linear-algebra software-engineering compiler nudge-targets[1206.1098] The interplay of intrinsic and extrinsic bounded noises in genetic networks
“After being considered as a nuisance to be filtered out, it became recently clear that biochemical noise plays a complex role, often fully functional, for a genetic network. The influence of intrinsic and extrinsic noises on genetic networks has intensively been investigated in last ten years, though contributions on the co-presence of both are sparse. Extrinsic noise is usually modeled as an unbounded white or colored gaussian stochastic process, even though realistic stochastic perturbations are clearly bounded. In this paper we consider Gillespie-like stochastic models of nonlinear networks, i.e. the intrinsic noise, where the model jump rates are affected by colored bounded extrinsic noises synthesized by a suitable biochemical state-dependent Langevin system. These systems are described by a master equation, and a simulation algorithm to analyze them is derived. This new modeling paradigm should enlarge the class of systems amenable at modeling. We investigated the influence of both amplitude and autocorrelation time of a extrinsic Sine-Wiener noise on: $(i)$ the Michaelis-Menten approximation of noisy enzymatic reactions, which we show to be applicable also in co-presence of both intrinsic and extrinsic noise, $(ii)$ a model of enzymatic futile cycle and $(iii)$ a genetic toggle switch. In $(ii)$ and $(iii)$ we show that the presence of a bounded extrinsic noise induces qualitative modifications in the probability densities of the involved chemicals, where new modes emerge, thus suggesting the possibile functional role of bounded noises.”
biochemistry structural-biology reaction-networks biological-engineering noise its-complicated-inside-a-cell simulation nudge-targets[1205.3058] A Tight Lower Bound on the Controllability of Networks with Multiple Leaders
“In this paper we study the controllability of networked systems with static network topologies using tools from algebraic graph theory. Each agent in the network acts in a decentralized fashion by updating its state in accordance with a nearest-neighbor averaging rule, known as the consensus dynamics. In order to control the system, external control inputs are injected into the so called leader nodes, and the influence is propagated throughout the network. Our main result is a tight topological lower bound on the rank of the controllability matrix for such systems with arbitrary network topologies and possibly multiple leaders.”
network-theory algorithms emergent-design nudge-targets- “However, when the game has no clear objectives, no met– ric exists to measure player contribution quality. Indeed, each player may have a different personal motivation to achieve dif– ferent self-imposed goals [4], and player actions can be con– sidered fair or disruptive towards the community depending on whether they respect or damage other player contributions. In these cases, there is a very abstract and subjective shared implicit objective that could be described as building a fair and not disruptive player community. It should be noted that fair players benefit from their behavior, as it is more likely that other players act fair towards them. Furthermore, a com– munity of disruptive players seems to repel fair players and the community quality has an intuitive tendency to gradually drop off. Contrarily, a community of fair players lures new fair players, which lead, in turn, to an increase of the commu– nity quality.”
social-dynamics game-theory collaboration performance-measure teams ranking-schemes agile-management to-explore [1205.3648] 6-Body Central Configurations Formed by Two Isosceles Triangles
“In this paper,we show the existence of a class of 6-body central configurations with two isosceles triangles; which are congruent to each other and keep some distance.We also study the necessary conditions about masses for the bodies which can form a central configuration.”
N-body-problems simulation mathematical-recreations nudge-targets special-cases inverse-problems-done-backwards[1206.1074] Memetic Artificial Bee Colony Algorithm for Large-Scale Global Optimization
People who misunderstand the difference between a “program”, a “design pattern” and an “algorithm”, I’m thinking. That said, an interesting camel’s nose for getting more contextual narrative and less ridiculous overgeneralization (even accidentally) in an engineering paper.…
algorithms programming subjective-objective-decontextualization-bias rather-interesting[1205.2200] A Greedy Double Swap Heuristic for Nurse Scheduling
“One of the key challenges of nurse scheduling problem (NSP) is the number of constraints placed on preparing the timetable, both from the regulatory requirements as well as the patients’ demand for the appropriate nursing care specialists. In addition, the preferences of the nursing staffs related to their work schedules add another dimension of complexity. Most solutions proposed for solving nurse scheduling involve the use of mathematical programming and generally considers only the hard constraints. However, the psychological needs of the nurses are ignored and this resulted in subsequent interventions by the nursing staffs to remedy any deficiency and often results in last minute changes to the schedule. In this paper, we present a staff preference optimization framework which is solved with a greedy double swap heuristic. The heuristic yields good performance in speed at solving the problem. The heuristic is simple and we will demonstrate its performance by implementing it on open source spreadsheet software.”
scheduling operations-research heuristics performance-measure nudge-targets- “Max gives you the parts to create unique sounds, stunning visuals, and engaging interactive media. These parts are called ‘objects’ – visual boxes that contain tiny programs to do something specific. Each object does something different. Some make noises, some make video effects, others just do simple calculations or make decisions. In Max you add objects to a visual canvas and connect them together with patchcords. You can use as many as you like. By combining objects, you create interactive and unique software without ever writing any code (you can do that too if you really want to). Just connect.”
visual-programming genetic-programming-target generative-art software languages [1205.3397] 1.85 Approximation for Min-Power Strong Connectivity
“Given a directed simple graph G=(V,E) and a nonnegative-valued cost function the power of a vertex u in a directed spanning subgraph H is given by the maximum cost of an arcs of H exiting u. The power of H is the sum of the power of its vertices. Power Assignment seeks to minimize the power of H while H satisfies some connectivity constraint. In this paper, we assume E is bidirected (for every directed edge e in E, the opposite edge exists and has the same cost), while H is required to be strongly connected. This is the original power assignment problem introduced by Chen and Huang in 1989, who proved that bidirected minimum spanning tree has approximation ratio at most 2 (this is tight). In Approx 2010, we introduced a Greedy approximation algorithm and claimed a ratio of 1.992. Here we improve the analysis to 1.85.”
algorithms computational-complexity operations-research nudge-targets[1206.1106] No More Pesky Learning Rates
“The performance of stochastic gradient descent (SGD) depends critically on how learning rates are tuned and decreased over time. We propose a method to automatically adjust multiple learning rates so as to minimize the expected error at any one time. The method relies on local gradient variations across samples. Using a number of convex and non-convex learning tasks, we show that the resulting algorithm matches the performance of the best settings obtained through systematic search, and effectively removes the need for learning rate tuning.”
optimization algorithms gradient-descent adaptive-control silos-in-action- “Getting better at software development can be hard work. There is an endless sea of learning materials out there, but just figuring out what topics you should focus on could take up all of your time if you let it. Don’t fall into that trap, instead, let me do the legwork for you! As a subscriber to Practicing Ruby, you’ll get access to well-polished weekly brain dumps about topics that will help you become a better Ruby developer. You’ll also be able to join a dedicated group of Practicing Rubyists in lively conversations about the eclectic mix of topics I’m writing about.”
Ruby programming tutorial subscriptions to-read [1206.1103] Periodizing quasicrystals: Anomalous diffusion in quasiperiodic systems
“We introduce a construction to embed a quasiperiodic lattice of obstacles into a single unit cell of a higher-dimensional space, with periodic boundary conditions. This construction transparently shows the existence of channels in these systems,in which particles may travel without colliding, up to a critical obstacle radius. It provides a simple and efficient algorithm for numerical simulation of dynamics in quasiperiodic structures, as well as giving a natural notion of uniform distribution (measure) and averages. As an application, we simulate diffusion in a two-dimensional quasicrystal, finding three different regimes, in particular atypical weak super-diffusion in the presence of channels, and sub-diffusion when obstacles overlap.”
quasicrystals tiling algorithms topology simulation computational-geometry[1205.3193] A Comparative Study of Collaborative Filtering Algorithms
“Collaborative filtering is a rapidly advancing research area. Every year several new techniques are proposed and yet it is not clear which of the techniques work best and under what conditions. In this paper we conduct a study comparing several collaborative filtering techniques — both classic and recent state-of-the-art — in a variety of experimental contexts. Specifically, we report conclusions controlling for number of items, number of users, sparsity level, performance criteria, and computational complexity. Our conclusions identify what algorithms work well and in what conditions, and contribute to both industrial deployment collaborative filtering algorithms and to the research community.”
overview collaborative-filtering performance-space prediction algorithms swarms- Genetic Programming explained. [according to most folks]
genetic-programming Lost self-definition you-keep-using-that-word [1204.6170] A distributed resource allocation algorithm for many processes
“Resource allocation is the problem that a process may enter a critical section CS of its code only when its resource requirements are not in conflict with those of other processes in their critical sections. For each execution of CS, these requirements are given anew. In the resource requirements, levels can be distinguished, such as e.g. read access or write access. We allow infinitely many processes that communicate by reliable asynchronous messages and have finite memory. A simple starvation-free solution is presented. Processes only wait for one another when they have conflicting resource requirements. The correctness of the solution is argued with invariants and temporal logic. It has been verified with the proof assistant PVS.”
distributed-processing concurrency nudge-targets algorithms modeling[1205.5911] A Hough Transform Approach to Solving Linear Min-Max Problems
“Several ways to accelerate the solution of 2D/3D linear min-max problems in $n$ constraints are discussed. We also present an algorithm for solving such problems in the 2D case, which is superior to CGAL’s linear programming solver, both in performance and in stability.”
algorithms computational-geometry nudge-targets convex-hulls linear-programming[1205.2664] A Bayesian Sampling Approach to Exploration in Reinforcement Learning
“We present a modular approach to reinforcement learning that uses a Bayesian representation of the uncertainty over models. The approach, BOSS (Best of Sampled Set), drives exploration by sampling multiple models from the posterior and selecting actions optimistically. It extends previous work by providing a rule for deciding when to resample and how to combine the models. We show that our algorithm achieves nearoptimal reward with high probability with a sample complexity that is low relative to the speed at which the posterior distribution converges during learning. We demonstrate that BOSS performs quite favorably compared to state-of-the-art reinforcement-learning approaches and illustrate its flexibility by pairing it with a non-parametric model that generalizes across states.”
exploration-and-exploitation algorithms machine-learning reinforcement-learning integrate-method-into-GP- “The new media folks desperately want to write for some hypothetical audience, one they can find the center of. They are like border collies, wired to herd sheep and frantic if they can’t find any.”
disintermediation-in-action journalism public-policy pollsters cultural-assumptions - “This paper studies parallelization schemes for stochastic Vector Quantization algorithms in order to obtain time speed-ups using distributed resources. We show that the most intuitive parallelization scheme does not lead to better performances than the sequential algorithm. Another distributed scheme is therefore introduced which obtains the expected speed-ups. Then, it is improved to fit implementation on distributed architectures where communications are slow and inter-machines synchronization too costly. The schemes are tested with simulated distributed architectures and, for the last one, with Microsoft Windows Azure platform obtaining speed-ups up to 32 Virtual Machines.”
algorithms distributed-processing parallelization nudge-targets Crypto breakthrough shows Flame was designed by world-class scientists | Ars Technica
“More interestingly, the results have shown that not our published chosen-prefix collision attack was used, but an entirely new and unknown variant,” Stevens wrote in a statement distributed on Thursday. “This has led to our conclusion that the design of Flame is partly based on world-class cryptanalysis. Further research will be conducted to reconstruct the entire chosen-prefix collision attack devised for Flame.“ The analysis reinforces theories that researchers from Kaspersky Lab, CrySyS Lab, and Symantec published almost two weeks ago. Namely, Flame could only have been developed with the backing of a wealthy nation-state. Stevens’ and de Weger’s conclusion means that, in addition to a team of engineers who developed a global malware platform that escaped detection for at least two years, Flame also required world-class cryptographers who have broken new ground in their field. “It’s not a garden-variety collision attack, or just an implementation of previous MD5 collisions papers—which would be difficult enough,” Matthew Green, a professor specializing in cryptography in the computer science department at Johns Hopkins University, told Ars. “There were mathematicians doing new science to make Flame work.”
cryptography MD5-woops algorithms nudge-targetsOn Outreach: something’s got to give | Neurotic Physiology
“A change in the way academia views outreach is going to take a while, and it may take even longer for science communication to become a really respected “alternative” career. In the meantime, many scientists DO do outreach. They go into schools, they give talks at bars, they talk to their friends and family. Some of them send me and other science bloggers articles (THANK YOU!! And you know, never hesitate to send me an article!) to cover, or speak out proudly in support of their work. There IS science outreach out there, and a lot of it is GREAT.”
science! academic-culture disintermediation-targets-who-have-noticed- “Many open problems remain. Some are of a fundamental nature. What does nature allow us to compute efficiently? What does nature allow us to make secure? Others are of a more practical nature. How will we build scalable quantum computers? For what problems are there effective quantum algorithms? How broad an impact will quantum information processing have? At the very least, quantum computation, and quantum information processing more generally, has changed forever how humanity thinks about and works with physics, computation, and information.”
quantums quantum-computing overview [1205.6867] Minimizing the average distance to a closest leaf in a phylogenetic tree
“In this paper we described a simple new criterion, minimizing the Average Dis– tance to the Closest Leaf (ADCL), for finding a subset of sequences that either represent the diversity of the sequences in a sample, or are close on average to a set of query sequences. In doing so, abundance information is taken into account and attempt to strike a balance between optimality and centrality in the tree. In particular, this criterion is the only way in which we are aware to pick sequences that are phylogenetically close on average to a set of query sequences. We have also investigated means of minimizing the ADCL, including a heuristic that performs well in practice and an exact dynamic program. ADCL minimization appears to avoid picking chimeric sequences in simulation.”
algorithms phylogenetics cladistics performance-measure nudge-targets bioinformatics- “How do I run a command from history? Type some part of the command, and then hit the up or down arrow keys to navigate through history matches.”
want unix shell - “Many models for sparse regression typically assume that the covariates are known completely, and without noise. Particularly in high-dimensional applications, this is often not the case. This paper develops efficient OMP-like algorithms to deal with precisely this setting. Our algorithms are as efficient as OMP, and improve on the best-known results for missing and noisy data in regression, both in the high-dimensional setting where we seek to recover a sparse vector from only a few measurements, and in the classical low-dimensional setting where we recover an unstructured regressor. In the high-dimensional setting, our support-recovery algorithm requires no knowledge of even the statistics of the noise. Along the way, we also obtain improved performance guarantees for OMP for the standard sparse regression problem with Gaussian noise.”
statistics algorithms nudge-targets performance-measure cultural-assumptions-of-statistics - “We study the inverse power index problem for weighted voting games: the problem of finding a weighted voting game in which the power of the players is as close as possible to a certain target distribution. Our goal is to find algorithms that solve this problem exactly. Thereto, we study various subclasses of simple games, and their associated representation methods. We survey algorithms and impossibility results for the synthesis problem, i.e., converting a representation of a simple game into another representation. We contribute to the synthesis problem by showing that it is impossible to compute in polynomial time the list of ceiling coalitions (also known as shift-maximal losing coalitions) of a game from its list of roof coalitions (also known as shift-minimal winning coalitions), and vice versa. ”
inverse-problems algorithms game-theory voting nudge-targets [1206.0937] Detecting Activations over Graphs using Spanning Tree Wavelet Bases
“This paper focuses on the problem of detecting activations over a graph when observations are corrupted by noise. The problem of detecting graph-structured activations is relevant to many applications including identifying congestion in router and road networks, eliciting preferences in social networks, and detecting viruses in human and computer networks. Furthermore, these applications require that the method is scalable to large graphs. Luckily, computer science boasts a plethora of efficient graph based algorithms that we can adapt to the detection framework.”
statistics network-theory algorithms pattern-discovery inverse-problems nudge-targets[1206.1270] Factoring nonnegative matrices with linear programs
“This paper describes a new approach for computing nonnegative matrix factorizations (NMFs) with linear programming. The key idea is a data-driven model for the factorization, in which the most salient features in the data are used to express the remaining features. More precisely, given a data matrix X, the algorithm identifies a matrix C that satisfies X is approximately equal to CX and some linear constraints. The matrix C selects features, which are then used to compute a low-rank NMF of X. A theoretical analysis demonstrates that this approach has the same type of guarantees as the recent NMF algorithm of Arora et al. (2012). In contrast with this earlier work, the proposed method (1) has better noise tolerance, (2) extends to more general noise models, and (3) leads to efficient, scalable algorithms. Experiments with synthetic and real datasets provide evidence that the new approach is also superior in practice. An optimized C++ implementation of the new algorithm can factor a multi-Gigabyte matrix in a matter of minutes.”
via:cshalizi algorithms linear-programming nudge-targets[1206.1032] Frequent Patterns mining in time-sensitive Data Stream
“Mining frequent itemsets through static Databases has been extensively studied and used and is always considered a highly challenging task. For this reason it is interesting to extend it to data streams field. In the streaming case, the frequent patterns’ mining has much more information to track and much greater complexity to manage. Infrequent items can become frequent later on and hence cannot be ignored. The output structure needs to be dynamically incremented to reflect the evolution of itemset frequencies over time. In this paper, we study this problem and specifically the methodology of mining time-sensitive data streams. We tried to improve an existing algorithm by increasing the temporal accuracy and discarding the out-of-date data by adding a new concept called the “Shaking Point”. We presented as well some experiments illustrating the time and space required.”
pattern-discovery time-series data-mining algorithms trading nudge-targets[1206.0580] O(1) Delta Component Computation Technique for the Quadratic Assignment Problem
“The paper describes a novel technique that allows to reduce by half the number of delta values that were required to be computed with complexity O(N) in most of the heuristics for the quadratic assignment problem. Using the correlation between the old and new delta values, obtained in this work, a new formula of complexity O(1) is proposed. Found result leads up to 25% performance increase in such well-known algorithms as Robust Tabu Search and others based on it.”
algorithms combinatorics operations-research quadratic-assignment nudge-targets- “Dynamical modelling lies at the heart of our understanding of physical systems. Its role in science is deeper than mere operational forecasting, in that it allows us to evaluate the adequacy of the mathematical structure of our models. Despite the importance of model parameters, there is no general method of parameter estimation outside linear systems. A new relatively simple method of parameter estimation for nonlinear systems is presented, based on variations in the accuracy of probability forecasts. It is illustrated on the Logistic Map, the Henon Map and the 12-D Lorenz96 flow, and its ability to outperform linear least squares in these systems is explored at various noise levels and sampling rates. As expected, it is more effective when the forecast error distributions are non-Gaussian. The new method selects parameter values by minimizing a proper, local skill score for continuous probability forecasts as a function of the parameter values. This new approach is easier to implement in practice than alternative nonlinear methods based on the geometry of attractors or the ability of the model to shadow the observations. New direct measures of inadequacy in the model, the “Implied Ignorance” and the information deficit are introduced.”
statistics symbolic-regression parameter-estimation algorithms nudge-targets - “…And listen, trust me, even if you do not feel that way at the end of these years, even if you are feeling burned out and done with the vagaries of social user design interaction universal community-driven agile information experience, even if you are ready to close your laptop screen forever, you are beloved on the earth. You are skilled and talented and young and bright and accredited. The world wishes it were you.”
advice speech [1206.0629] DEMON: a Local-First Discovery Method for Overlapping Communities
“Community discovery in complex networks is an interesting problem with a number of applications, especially in the knowledge extraction task in social and information networks. However, many large networks often lack a particular community organization at a global level. In these cases, traditional graph partitioning algorithms fail to let the latent knowledge embedded in modular structure emerge, because they impose a top-down global view of a network. We propose here a simple local-first approach to community discovery, able to unveil the modular organization of real complex networks. This is achieved by democratically letting each node vote for the communities it sees surrounding it in its limited view of the global system, i.e. its ego neighborhood, using a label propagation algorithm; finally, the local communities are merged into a global collection. We tested this intuition against the state-of-the-art overlapping and non-overlapping community discovery methods, and found that our new method clearly outperforms the others in the quality of the obtained communities, evaluated by using the extracted communities to predict the metadata about the nodes of several real world networks. We also show how our method is deterministic, fully incremental, and has a limited time complexity, so that it can be used on web-scale real networks.”
network-theory algorithms community-discovery statistics explanation referral-networks[1206.0430] Congestion Games on Weighted Directed Graphs, with Applications to Spectrum Sharing
“With the advance of complex large-scale networks, it is becoming increasingly important to understand how selfish and spatially distributed individuals will share network resources without centralized coordinations. In this paper, we introduce the graphical congestion game with weighted edges (GCGWE) as a general theoretical model to study this problem. In GCGWE, we view the players as vertices in a weighted graph. The amount of negative impact (e.g. congestion) caused by two close-by players to each other is determined by the weight of the edge linking them. The GCGWE unifies and significantly generalizes several simpler models considered in the previous literature, and is well suited for modeling a wide range of networking scenarios. One good example is to use the GCGWE to model spectrum sharing in wireless networks, where we can properly define the edge weights and payoff functions to capture the rather complicated interference relationship between wireless nodes. By identifying which GCGWEs possess pure Nash equilibria and the very desirable finite improvement property, we gain insight into when spatially distributed wireless nodes will be able to self-organize into a mutually acceptable resource allocation. We also consider the efficiency of the pure Nash equilibria, and the computational complexity of finding them.”
network-theory algorithms agent-based nudge-targets complexology[1206.0855] A Mixed Observability Markov Decision Process Model for Musical Pitch
“Partially observable Markov decision processes have been widely used to provide models for real-world decision making problems. In this paper, we will provide a method in which a slightly different version of them called Mixed observability Markov decision process, MOMDP, is going to join with our problem. Basically, we aim at offering a behavioural model for interaction of intelligent agents with musical pitch environment and we will show that how MOMDP can shed some light on building up a decision making model for musical pitch conveniently.”
music machine-learning generative-art pattern-discovery nudge-targets[1206.0766] Why Optimal States Recruit Fewer Reactions in Metabolic Networks
“The metabolic network of a living cell involves several hundreds or thousands of interconnected biochemical reactions. Previous research has shown that under realistic conditions only a fraction of these reactions is concurrently active in any given cell. This is partially determined by nutrient availability, but is also strongly dependent on the metabolic function and network structure. Here, we establish rigorous bounds showing that the fraction of active reactions is smaller (rather than larger) in metabolic networks evolved or engineered to optimize a specific metabolic task, and we show that this is largely determined by the presence of thermodynamically irreversible reactions in the network. We also show that the inactivation of a certain number of reactions determined by irreversibility can generate a cascade of secondary reaction inactivations that propagates through the network. The mathematical results are complemented with numerical simulations of the metabolic networks of the bacterium Escherichia coli and of human cells, which show, counterintuitively, that even the maximization of the total reaction flux in the network leads to a reduced number of active reactions.”
network-theory biological-engineering metabolic-networks systems-biology engineering-design structural-biology- “But the ideas originally behind that trope — now that’s the cool part. My friends who work in aerospace tell me the old guys who built the industry all grew up reading Heinlein and Clarke, and went into aerospace to turn those crazy things they read as kids into practical realities as adults. Well, I work in supercomputing, and I can assure you that this industry is full of young geniuses who grew up reading Gibson, Vinge, and Rucker — and yes, me — and they went into this field to do the same thing. We don’t quite live in the world that cyberpunk fiction predicted. But we live in the world that the kids who grew up reading cyberpunk fiction built, and that is a very cool thing indeed.”
cyberpunk fiction genre retrospective science-fiction cultural-norms