These are my recent Pinboard.in links:
- “…To cluster sets of histogram data, we propose to use Dynamic Clustering Algorithm, (based on adaptive squared Wasserstein distances) that is a k-means-like algorithm for clustering a set of individuals into K classes that are apriori fixed. The main aim of this research is to provide a tool for clustering histograms, emphasizing the different contributions of the histogram variables, and their components, to the definition of the clusters. We demonstrate that this can be achieved using adaptive distances. Two kind of adaptive distances are considered: the first takes into account the variability of each component of each descriptor for the whole set of individuals; the second takes into account the variability of each component of each descriptor in each cluster. We furnish interpretative tools of the obtained partition based on an extension of the classical measures (indexes) to the use of adaptive distances in the clustering criterion function. Applications on synthetic and real-world data corroborate the proposed procedure.”
classification statistics histograms metrics clustering - “Biology presents many examples of planar distribution and structural networks having dense sets of closed loops. An archetype of this form of network organization is the vasculature of dicotyledonous leaves, which showcases a hierarchically-nested architecture containing closed loops at many different levels. Although a number of methods have been proposed to measure aspects of the structure of such networks, a robust metric to quantify their hierarchical organization is still lacking. We present an algorithmic framework, the hierarchical loop decomposition, that allows mapping loopy networks to binary trees, preserving in the connectivity of the trees the architecture of the original graph. We apply this framework to investigate computer generated graphs, such as artificial models and optimal distribution networks, as well as natural graphs extracted from digitized images of dicotyledonous leaves and vasculature of rat cerebral neocortex. We calculate various metrics based on the Asymmetry, the cumulative size distribution and the Strahler bifurcation ratios of the corresponding trees and discuss the relationship of these quantities to the architectural organization of the original graphs. This algorithmic framework decouples the geometric information (exact location of edges and nodes) from the metric topology (connectivity and edge weight) and it ultimately allows us to perform a quantitative statistical comparison between predictions of theoretical models and naturally occurring loopy graphs.”
complexology biophysics network-theory metrics - “Using memristive properties common for the titanium dioxide thin film devices, we designed a simple write algorithm to tune device conductance at a specific bias point to 1% relative accuracy (which is roughly equivalent to 7-bit precision) within its dynamic range even in the presence of large variations in switching behavior. The high precision state is nonvolatile and the results are likely to be sustained for nanoscale memristive devices because of the inherent filamentary nature of the resistive switching. The proposed functionality of memristive devices is especially attractive for analog computing with low precision data. As one representative example we demonstrate hybrid circuitry consisting of CMOS summing amplifier and two memristive devices to perform analog multiply and accumulate computation, which is a typical bottleneck operation in information processing.”
memristors engineering-design simulation control-systems nudge-targets [1110.1521] Nodal domains of a non-separable problem — the right angled isosceles triangle
“Our result may be generalized to other domains where similar algorithms may apply. Our algorithm is based on the fact that the eigenfunctions are presented as a linear combination of simple plane waves. It is therefore tempting to try and generalize it for other drums with similar property. The equilateral triangle is an immediate candidate (see [29] and references within). A further, and quite surprising, result is the recursive formula for the number of nodal loops. To our knowledge this is the first known exact formula for the nodal count of a non-separable planar manifold (for certain eigenfunctions of tori exact formulas have been given in [22]). The formula was found by direct inspection of large tables and has been verified for a large bulk of data computationally. An obvious challenge is to prove this formula. In particular, the recursive part of the formula resembles the famous Euclid algorithm for the greatest common divisor. A further investigation of the mentioned formula might therefore expose some new number theoretical properties of the nodal count.”
physics algorithms analytical-results open-questions geometry acoustics exact-form nudge-targets[1110.1485] A Face Recognition Scheme using Wavelet Based Dominant Features
“In this paper, a multi-resolution feature extraction algorithm for face recognition is proposed based on two-dimensional discrete wavelet transform (2D-DWT), which efficiently exploits the local spatial variations in a face image. For the purpose of feature extraction, instead of considering the entire face image, an entropy-based local band selection criterion is developed, which selects high-informative horizontal segments from the face image. In order to capture the local spatial variations within these highinformative horizontal bands precisely, the horizontal band is segmented into several small spatial modules. Dominant wavelet coefficients corresponding to each local region residing inside those horizontal bands are selected as features. In the selection of the dominant coefficients, a threshold criterion is proposed, which not only drastically reduces the feature dimension but also provides high within-class compactness and high between-class separability. A principal component analysis is performed to further reduce the dimensionality of the feature space. Extensive experimentation is carried out upon standard face databases and a very high degree of recognition accuracy is achieved by the proposed method in comparison to those obtained by some of the existing methods.”
face-recognition algorithms image-processing wavelets nudge-targets[1110.1553] Hierarchical QR factorization algorithms for multi-core cluster systems
“This paper describes a new QR factorization algorithm which is especially designed for massively parallel platforms combining parallel distributed multi-core nodes. These platforms make the present and the foreseeable future of high-performance computing. Our new QR factorization algorithm falls in the category of the tile algorithms which naturally enables good data locality for the sequential kernels executed by the cores (high sequential performance), low number of messages in a parallel distributed setting (small latency term), and fine granularity (high parallelism).”
parallel-computing operations-research factorization algorithms nudge-targets meta-algorithms- “Graph coloring is used in wireless networks to optimize network resources: bandwidth and energy. Nodes access the medium according to their color. It is the responsibility of the coloring algorithm to ensure that interfering nodes do not have the same color. In this research report, we focus on wireless sensor networks with grid topologies. How does a coloring algorithm take advantage of the regularity of grid topology to provide an optimal periodic coloring, that is a coloring with the minimum number of colors? We propose the Vector-Based Coloring Method, denoted VCM, a new method that is able to provide an optimal periodic coloring for any radio transmission range and for any h-hop coloring, h>=1. This method consists in determining at which grid nodes a color can be reproduced without creating interferences between these nodes while minimizing the number of colors used. We compare the number of colors provided by VCM with the number of colors obtained by a distributed coloring algorithm with line and column priority assignments. We also provide bounds on the number of colors of optimal general colorings of the infinite grid, and show that periodic colorings (and thus VCM) are asymptotically optimal. Finally, we discuss the applicability of this method to a real wireless network.”
graph-theory algorithms operations-research nudge-targets [1110.1580] A Polylogarithmic-Competitive Algorithm for the k-Server Problem
“We give the first polylogarithmic-competitive randomized online algorithm for the $k$-server problem on an arbitrary finite metric space. In particular, our algorithm achieves a competitive ratio of O(log^3 n log^2 k log log n) for any metric space on n points. Our algorithm improves upon the deterministic (2k-1)-competitive algorithm of Koutsoupias and Papadimitriou [J.ACM’95] whenever n is sub-exponential in k.”
scheduling operations-research algorithms nudge-targets[1110.1590] PSA: The Packet Scheduling Algorithm for Wireless Sensor Networks
“The main cause of wasted energy consumption in wireless sensor networks is packet collision. The packet scheduling algorithm is therefore introduced to solve this problem. Some packet scheduling algorithms can also influence and delay the data transmitting in the real-time wireless sensor networks. This paper presents the packet scheduling algorithm (PSA) in order to reduce the packet congestion in MAC layer leading to reduce the overall of packet collision in the system The PSA is compared with the simple CSMA/CA and other approaches using network topology benchmarks in mathematical method. The performances of our PSA are better than the standard (CSMA/CA). The PSA produces better throughput than other algorithms. On other hand, the average delay of PSA is higher than previous works. However, the PSA utilizes the channel better than all algorithms.”
sensor-networks distributed-processing scheduling routing operations-research algorithms nudge-targets[1110.0725] A Survey of Distributed Data Aggregation Algorithms
“Distributed data aggregation has been an active field of research in the last decade, and a huge diverse amount of techniques can be found in the literature. For this reasons, this survey intends to be an important time saving instrument, for those that desire to get a quick and comprehensive overview of the state of the art on distributed data aggregation. Moreover, by carefully highlighting the strength and limitations of the more pertinent approaches, this study can provide a useful assistance to help readers choose which technique to apply in specific settings. Currently, there is no ideal general solution to the distributed computation of an aggregation function, all existing techniques have its pitfalls (some more than others). Therefore, more research in this field will be expected in the next few years. In particular, due to the added value of computing complex aggregates, new algorithms might arise to estimate the statistical distribution of values, as the few existing approaches exhibit some limitations in terms of accuracy and resource consumption. Additional research efforts should be made to improve the support to churn, message loss, and continuous estimation of mutable input values.”
statistics reviews distributed-processing communication coordination nudge-targets- “Network or graph structures are ubiquitous in the study of complex systems. Often, we are interested in complexity trends of these system as it evolves under some dynamic. An example might be looking at the complexity of a food web as species enter an ecosystem via migration or speciation, and leave via extinction. In a previous paper, a complexity measure of networks was proposed based on the {em complexity is information content} paradigm. To apply this paradigm to any object, one must fix two things: a representation language, in which strings of symbols from some alphabet describe, or stand for the objects being considered; and a means of determining when two such descriptions refer to the same object. With these two things set, the information content of an object can be computed in principle from the number of equivalent descriptions describing a particular object. The previously proposed representation language had the deficiency that the fully connected and empty networks were the most complex for a given number of nodes. A variation of this measure, called zcomplexity, applied a compression algorithm to the resulting bitstring representation, to solve this problem. Unfortunately, zcomplexity proved too computationally expensive to be practical. In this paper, I propose a new representation language that encodes the number of links along with the number of nodes and a representation of the linklist. This, like zcomplexity, exhibits minimal complexity for fully connected and empty networks, but is as tractable as the original measure.”
network-theory complexology complex-systems measurement perform structure-function-relations discrete-mathematics - “Two different conceptions of emergence are reconciled as two instances of the phenomenon of detection. In the process of comparing these two conceptions, we find that the notions of complexity and detection allow us to form a unified definition of emergence that clearly delineates the role of the observer.”
complexology emergence pragmatism-it-ain’t but-soon - “This paper addresses weight optimization problem in distributed consensus averaging algorithm over networks with symmetric star topology. We have determined optimal weights and convergence rate of the network in terms of its topological parameters. In addition, two alternative topologies with more rapid convergence rates have been introduced. The new topologies are Complete-Cored Symmetric (CCS) star and K-Cored Symmetric (KCS) star topologies. It has been shown that the optimal weights for the edges of central part in symmetric and CCS star configurations are independent of their branches. By simulation optimality of obtained weights under quantization constraints have been verified.”
operations-research decision-making network-theory nudge-targets [1109.5389] Water drives peptide conformational transitions
“Transitions between metastable conformations of a dipeptide are investigated using classical molecular dynamics simulation with explicit water molecules. The distribution of the surrounding water at different moments before the transitions and the dynamical correlations of water with the peptide’s configurational motions indicate that water is the main driving force of the conformational changes.”
molecular-design systems-biology simulation intracellular-dynamics kinda-knew-this-a-long-time-ago biochemistry[1105.1445] Vehicular traffic flow at an intersection with the possibility of turning
“We have developed a Nagel-Schreckenberg cellular automata model for describing of vehicular traffic flow at a single intersection. A set of traffic lights operating in fixed-time scheme controls the traffic flow. Open boundary condition is applied to the streets each of which conduct a uni-directional flow. Streets are single-lane and cars can turn upon reaching to the intersection with prescribed probabilities. Extensive Monte Carlo simulations are carried out to find the model flow characteristics. In particular, we investigate the flows dependence on the signalisation parameters, turning probabilities and input rates. It is shown that for each set of parameters, there exist a plateau region inside which the total outflow from the intersection remains almost constant. We also compute total waiting time of vehicles per cycle behind red lights for various control parameters.”
cellular-automata complexology traffic-models agent-based simulation nudge-substrates[1110.1391] A Comparison of Different Machine Transliteration Models
“Machine transliteration is a method for automatically converting words in one language into phonetically equivalent ones in another language. Machine transliteration plays an important role in natural language applications such as information retrieval and machine translation, especially for handling proper nouns and technical terms. Four machine transliteration models — grapheme-based transliteration model, phoneme-based transliteration model, hybrid transliteration model, and correspondence-based transliteration model — have been proposed by several researchers. To date, however, there has been little research on a framework in which multiple transliteration models can operate simultaneously. Furthermore, there has been no comparison of the four models within the same framework and using the same data. We addressed these problems by 1) modeling the four models within the same framework, 2) comparing them under the same conditions, and 3) developing a way to improve machine transliteration through this comparison. Our comparison showed that the hybrid and correspondence-based models were the most effective and that the four models can be used in a complementary manner to improve machine transliteration performance.”
natural-language-processing machine-learning review nudge-targets- “In this paper we propose numerical measures for evaluating the aesthetic interest of simple patterns. The patterns consist of elements (symbols, pixels, etc.) in regular square arrays. The measures depend on two characteristics of the patterns: the number of different types of element, and the number of symmetries in their arrangement. We define two complementary composite measures L and C for the degree of pattern in a design, and compute them here for 2×2 and 6×6 arrays. The results distinguish simple from high-variation cases. We suspect that the measure L corresponds to the degree that human beings intuitively feel a design to be “interesting”, so this model would aid in quantifying the visual connection of two– dimensional designs with viewers. The other composite measure C based on these numerical properties characterizes the extent of randomness of an array. Combining symbol variety with symmetry calculations allows us to employ hierarchical scaling to count the relative impact of different levels of scale. By identifying substructures we can distinguish between organized patterns and disorganized complexity. The measures described here are related to verbal descriptors derived from work by psychologists on responses to visual environments.”
cognition aesthetics experimental-psychology nudge-targets learning-by-watching [1106.5264] Acquiring Correct Knowledge for Natural Language Generation
“Natural language generation (NLG) systems are computer software systems that produce texts in English and other human languages, often from non-linguistic input data. NLG systems, like most AI systems, need substantial amounts of knowledge. However, our experience in two NLG projects suggests that it is difficult to acquire correct knowledge for NLG systems; indeed, every knowledge acquisition (KA) technique we tried had significant problems. In general terms, these problems were due to the complexity, novelty, and poorly understood nature of the tasks our systems attempted, and were worsened by the fact that people write so differently. This meant in particular that corpus-based KA approaches suffered because it was impossible to assemble a sizable corpus of high-quality consistent manually written texts in our domains; and structured expert-oriented KA techniques suffered because experts disagreed and because we could not get enough information about special and unusual cases to build robust systems. We believe that such problems are likely to affect many other NLG systems as well. In the long term, we hope that new KA techniques may emerge to help NLG system builders. In the shorter term, we believe that understanding how individual KA techniques can fail, and using a mixture of different KA techniques with different strengths and weaknesses, can help developers acquire NLG knowledge that is mostly correct.”
natural-language-processing artificial-intelligence interesting-problems high-hanging-fruit machine-learning nudge-targets- “The present article is an invited contribution to the Encyclopedia of Complexity and System Science, Robert A. Meyers Ed., Springer New York (2009). It is a review of the biophysical mechanisms that underly cell motility.…”
biophysics biology review i-used-to-do-this-stuff lovely [1110.0671] Width Distributions for Convex Regular Polyhedra
“The mean width is a measure on three-dimensional convex bodies that enjoys equal status with volume and surface area [Rota]. As the phrase suggests, it is the mean of a probability density f. We verify formulas for mean widths of the regular tetrahedron and the cube. Higher-order moments of f_tetra and f_cube have not been examined until now. Assume that each polyhedron has edges of unit length. We deduce that the mean square width of the regular tetrahedron is 1/3+(3+sqrt(3))/(3*pi) and the mean square width of the cube is 1+4/pi.”
geometry mathematical-recreations nudge-targets[cs/0305036] Using Dynamic Simulation in the Development of Construction Machinery
“As in the car industry for quite some time, dynamic simulation of complete vehicles is being practiced more and more in the development of off-road machinery. However, specific questions arise due not only to company structure and size, but especially to the type of product. Tightly coupled, non-linear subsystems of different domains make prediction and optimisation of the complete system’s dynamic behaviour a challenge. Furthermore, the demand for versatile machines leads to sometimes contradictory target requirements and can turn the design process into a hunt for the least painful compromise. This can be avoided by profound system knowledge, assisted by simulation-driven product development. This paper gives an overview of joint research into this issue by Volvo Wheel Loaders and Linkoping University on that matter, lists the results of a related literature review and introduces the term “operateability”. Rather than giving detailed answers, the problem space for ongoing and future research is examined and possible solutions are sketched.”
engineering-design design-automation modeling dynamical-systems manufacturing nudge-targets