Items of some interest…

These are my recent Pinboard.in links:

  • [1108.4135] Complex-Valued Autoencoders

    "Autoencoders are unsupervised machine learning circuits whose learning goal is to minimize a distortion measure between inputs and outputs. Linear autoencoders can be defined over any field and only real-valued linear autoencoder have been studied so far. Here we study complex-valued linear autoencoders where the components of the training vectors and adjustable matrices are defined over the complex field with the $L_2$ norm. We provide simpler and more general proofs that unify the real-valued and complex-valued cases, showing that in both cases the landscape of the error function is invariant under certain groups of transformations. The landscape has no local minima, a family of global minima associated with Principal Component Analysis, and many families of saddle points associated with orthogonal projections onto sub-space spanned by sub-optimal subsets of eigenvectors of the covariance matrix. The theory yields several iterative, convergent, learning algorithms, a clear understanding of the generalization properties of the trained autoencoders, and can equally be applied to the hetero-associative case when external targets are provided. Partial results on deep architecture as well as the differential geometry of autoencoders are also presented. The general framework described here is useful to classify autoencoders and identify general common properties that ought to be investigated for each class, illuminating some of the connections between information theory, unsupervised learning, clustering, Hebbian learning, and auto encoders."

    neural-networks machine-learning classification encoding algorithms nudge-targets

  • [1108.5685] Predicting flow reversals in chaotic natural convection using data assimilation

    "A simplified model of natural convection, similar to the Lorenz (1963) system, is compared to computational fluid dynamics simulations in order to test data assimilation methods and better understand the dynamics of convection. The thermosyphon is represented by a long time flow simulation, which serves as a reference "truth". Forecasts are then made using the Lorenz-like model and synchronized to noisy and limited observations of the truth using data assimilation. The resulting analysis is observed to infer dynamics absent from the model when using short assimilation windows.

    Furthermore, chaotic flow reversal occurrence and residency times in each rotational state are forecast using analysis data. Flow reversals have been successfully forecast in the related Lorenz system, as part of a perfect model experiment, but never in the presence of significant model error or unobserved variables. Finally, we provide new details concerning the fluid dynamical processes present in the thermosyphon during these flow reversals."

    chaos dynamical-systems experiment prediction numerical-methods algorithms nudge-targets

  • [1108.1320] Compressed Matrix Multiplication

    "Motivated by the problems of computing sample covariance matrices, and of transforming a collection of vectors to a basis where they are sparse, we present a simple algorithm that computes an approximation of the product of two n-by-n real matrices A and B.…"

    approximation algorithms nudge-targets

  • [1110.5296] Computing a Longest Common Palindromic Subsequence

    "The {em longest common subsequence (LCS)} problem is a classic and well-studied problem in computer science. Palindrome is a word which reads the same forward as it does backward. The {em longest common palindromic subsequence (LCPS)} problem is an interesting variant of the classic LCS problem which finds the longest common subsequence between two given strings such that the computed subsequence is also a palindrome. In this paper, we study the LCPS problem and give efficient algorithms to solve this problem. To the best of our knowledge, this is the first attempt to study and solve this interesting problem."

    combinatorics strings algorithms nudge-targets

Items of some interest…

These are my recent Pinboard.in links:

  • [1110.0585] Discriminately Decreasing Discriminability with Learned Image Filters

    "In machine learning and computer vision, input images are often filtered to increase data discriminability. In some situations, however, one may wish to purposely decrease discriminability of one classification task (a "distractor" task), while simultaneously preserving information relevant to another (the task-of-interest): For example, it may be important to mask the identity of persons contained in face images before submitting them to a crowdsourcing site (e.g., Mechanical Turk) when labeling them for certain facial attributes. Another example is inter-dataset generalization: when training on a dataset with a particular covariance structure among multiple attributes, it may be useful to suppress one attribute while preserving another so that a trained classifier does not learn spurious correlations between attributes. In this paper we present an algorithm that finds optimal filters to give high discriminability to one task while simultaneously giving low discriminability to a distractor task. We present results showing the effectiveness of the proposed technique on both simulated data and natural face images."

    machine-learning data-preparation filtering algorithms nudge-targets

  • [1110.5183] Diffusion of Information in Robot Swarms

    "This work is devoted to communication approaches, which spread information in robot swarms. These mechanisms are useful for large-scale systems and also for such cases when a limited communication equipment does not allow routing of information packages. We focus on two approaches such as virtual fields and epidemic algorithms, discuss several aspects of hardware implementation and demonstrate experiments performed with microrobots "Jasmine"."

    agent-based swarms communication complex-systems epidemiology dynamical-systems experiment

  • [1109.5229] Distributed Algorithms for Optimal Power Flow Problem

    "Optimal power flow (OPF) is an important problem for power generation and it is in general non-convex. With the employment of renewable energy, it will be desirable if OPF can be solved very efficiently so its solution can be used in real time. With some special network structure, e.g. trees, the problem has been shown to have a zero duality gap and the convex dual problem yields the optimal solution. In this paper, we propose a primal and a dual algorithm to coordinate the smaller subproblems decomposed from the convexified OPF. We can arrange the subproblems to be solved sequentially and cumulatively in a central node or solved in parallel in distributed nodes. We test the algorithms on IEEE radial distribution test feeders, some random tree-structured networks, and the IEEE transmission system benchmarks. Simulation results show that the computation time can be improved dramatically with our algorithms over the centralized approach of solving the problem without decomposition, especially in tree-structured problems. The computation time grows linearly with the problem size with the cumulative approach while the distributed one can have size-independent computation time."

    operations-research algorithms network-theory infrastructure composition nudge-targets

  • Coupon collector’s problem – Wikipedia, the free encyclopedia

    "In probability theory, the coupon collector's problem describes the "collect all coupons and win" contests. It asks the following question: Suppose that there are n coupons, from which coupons are being collected with replacement. What is the probability that more than t sample trials are needed to collect all n coupons? An alternative statement is: Given n coupons, how many coupons do you expect you need to draw with replacement before having drawn each coupon at least once. The mathematical analysis of the problem reveals that the expected number of trials needed grows as Θ(nlog(n)). For example, when n = 50 it takes about 225 trials to collect all 50 coupons."

    probability-theory slic-representation nudge-targets

  • [1107.2379] Data Stability in Clustering: A Closer Look

    "This paper considers the model introduced by Bilu and Linial (2010), who study problems for which the optimal clustering does not change when the distances are perturbed by multiplicative factors. They show that even when a problem is NP-hard, it is sometimes possible to obtain polynomial-time algorithms for instances resilient to large perturbations, e.g. on the order of $O(sqrt{n})$ for max-cut clustering. Awasthi et al. (2010) extend this line of work by considering center-based objectives, and Balcan and Liang (2011) consider the $k$-median and min-sum objectives, giving efficient algorithms for instances resilient to certain constant multiplicative perturbations.

    Here, we are motivated by the question of to what extent these assumptions can be relaxed while allowing for efficient algorithms. We show there is little room to improve these results by giving NP-hardness lower bounds for both the $k$-median and min-sum objectives. On the other hand, we show that multiplicative resilience parameters, even only on the order of $Theta(1)$, can be so strong as to make the clustering problem trivial, and we exploit these assumptions to present a simple one pass streaming algorithm for the $k$-median objective. We also consider a model of additive perturbations and give a correspondence between additive and multiplicative notions of stability. Our results provide a close examination of the consequences of assuming, even constant, stability in data."

    clustering statistics algorithms robustness nudge-targets

  • [1109.5641] Energy-Aware Load Balancing in Content Delivery Networks

    "Internet-scale distributed systems such as content delivery networks (CDNs) operate hundreds of thousands of servers deployed in thousands of data center locations around the globe. Since the energy costs of operating such a large IT infrastructure are a significant fraction of the total operating costs, we argue for redesigning CDNs to incorporate energy optimizations as a first-order principle. We propose techniques to turn off CDN servers during periods of low load while seeking to balance three key design goals: maximize energy reduction, minimize the impact on client-perceived service availability (SLAs), and limit the frequency of on-off server transitions to reduce wear-and-tear and its impact on hardware reliability. We propose an optimal offline algorithm and an online algorithm to extract energy savings both at the level of local load balancing within a data center and global load balancing across data centers. We evaluate our algorithms using real production workload traces from a large commercial CDN. Our results show that it is possible to reduce the energy consumption of a CDN by more than 55% while ensuring a high level of availability that meets customer SLA requirements and incurring an average of one on-off transition per server per day. Further, we show that keeping even 10% of the servers as hot spares helps absorb load spikes due to global flash crowds with little impact on availability SLAs. Finally, we show that redistributing load across proximal data centers can enhance service availability significantly, but has only a modest impact on energy savings."

    algorithms load-balancing infrastructure nudge-targets operations-research

  • [1110.0264] Face Recognition using Optimal Representation Ensemble

    "Recently, the face recognizers based on linear representations have been shown to deliver state-of-the-art performance. In real-world applications, however, face images usually suffer from expressions, disguises and random occlusions. The problematic facial parts undermine the validity of the linear-subspace assumption and thus the recognition performance deteriorates significantly. In this work, we address the problem in a learning-inference-mixed fashion. By observing that the linear-subspace assumption is more reliable on certain face patches rather than on the holistic face, some Bayesian Patch Representations (BPRs) are randomly generated and interpreted according to the Bayes' theory. We then train an ensemble model over the patch-representations by minimizing the empirical risk w.r.t the "leave-one-out margins". The obtained model is termed Optimal Representation Ensemble (ORE), since it guarantees the optimality from the perspective of Empirical Risk Minimization. To handle the unknown patterns in test faces, a robust version of BPR is proposed by taking the non-face category into consideration. Equipped with the Robust-BPRs, the inference ability of ORE is increased dramatically and several record-breaking accuracies (99.9% on Yale-B and 99.5% on AR) and desirable efficiencies (below 20 ms per face in Matlab) are achieved. It also overwhelms other modular heuristics on the faces with random occlusions, extreme expressions and disguises. Furthermore, to accommodate immense BPRs sets, a boosting-like algorithm is also derived. The boosted model, a.k.a Boosted-ORE, obtains similar performance to its prototype. Besides the empirical superiorities, two desirable features of the proposed methods, namely, the training-determined model-selection and the data-weight-free boosting procedure, are also theoretically verified."

    image-analysis face-recognition algorithms nudge-targets

  • [1110.0957] Dictionary Learning for Deblurring and Digital Zoom

    "This paper proposes a novel approach to image deblurring and digital zooming using sparse local models of image appearance. These models, where small image patches are represented as linear combinations of a few elements drawn from some large set (dictionary) of candidates, have proven well adapted to several image restoration tasks. A key to their success has been to learn dictionaries adapted to the reconstruction of small image patches. In contrast, recent works have proposed instead to learn dictionaries which are not only adapted to data reconstruction, but also tuned for a specific task. We introduce here such an approach to deblurring and digital zoom, using pairs of blurry/sharp (or low-/high-resolution) images for training, as well as an effective stochastic gradient algorithm for solving the corresponding optimization task. Although this learning problem is not convex, once the dictionaries have been learned, the sharp/high-resolution image can be recovered via convex optimization at test time. Experiments with synthetic and real data demonstrate the effectiveness of the proposed approach, leading to state-of-the-art performance for non-blind image deblurring and digital zoom."

    image-processing image-analysis algorithms nudge-targets hyperresolution

  • [1110.0477] Distributed Evolutionary Graph Partitioning

    "We present a novel distributed evolutionary algorithm, KaFFPaE, to solve the Graph Partitioning Problem, which makes use of KaFFPa (Karlsruhe Fast Flow Partitioner). The use of our multilevel graph partitioner KaFFPa provides new effective crossover and mutation operators. By combining these with a scalable communication protocol we obtain a system that is able to improve the best known partitioning results for many inputs in a very short amount of time. For example, in Walshaw's well known benchmark tables we are able to improve or recompute 76% of entries for the tables with 1%, 3% and 5% imbalance."

    algorithms graph-theory evolutionary-algorithms nudge-targets

  • [1104.1605] Efficient Top-K Retrieval in Online Social Tagging Networks

    "We consider in this paper top-k query answering in social tagging systems, also known as folksonomies. This problem requires a significant departure from existing, socially agnostic techniques. In a network-aware context, one can (and should) exploit the social links, which can indicate how users relate to the seeker and how much weight their tagging actions should have in the result build-up. We propose an algorithm that has the potential to scale to current applications. While the problem has already been considered in previous literature, this was done either under strong simplifying assumptions or under choices that cannot scale to even moderate-size real world applications. We first consider a key aspect of the problem, which is accessing the closest or most relevant users for a given seeker. We describe how this can be done on the fly (without any pre-computations) for several possible choices – arguably the most natural ones – of proximity computation in a user network. Based on this, our top-k algorithm is sound and complete, while addressing the scalability issues of the existing ones. Importantly, our technique is instance optimal in the case when the search relies exclusively on the social weight of tagging actions. To further reduce response times, we then consider directions for efficiency by approximation. Extensive experiments on real world data show that our techniques can drastically improve the response time, without sacrificing precision."

    folksonomy tagging network-theory search-algorithms nudge-targets data-access

Items of some interest…

These are my recent Pinboard.in links:

  • [1110.1462] Dynamic Clustering of Histogram Data Based on Adaptive Squared Wasserstein Distances

    "…To cluster sets of histogram data, we propose to use Dynamic Clustering Algorithm, (based on adaptive squared Wasserstein distances) that is a k-means-like algorithm for clustering a set of individuals into K classes that are apriori fixed. The main aim of this research is to provide a tool for clustering histograms, emphasizing the different contributions of the histogram variables, and their components, to the definition of the clusters. We demonstrate that this can be achieved using adaptive distances.

    Two kind of adaptive distances are considered: the first takes into account the variability of each component of each descriptor for the whole set of individuals; the second takes into account the variability of each component of each descriptor in each cluster. We furnish interpretative tools of the obtained partition based on an extension of the classical measures (indexes) to the use of adaptive distances in the clustering criterion function. Applications on synthetic and real-world data corroborate the proposed procedure."

    classification statistics histograms metrics clustering

  • [1110.1412] Quantifying loopy network architectures

    "Biology presents many examples of planar distribution and structural networks having dense sets of closed loops. An archetype of this form of network organization is the vasculature of dicotyledonous leaves, which showcases a hierarchically-nested architecture containing closed loops at many different levels. Although a number of methods have been proposed to measure aspects of the structure of such networks, a robust metric to quantify their hierarchical organization is still lacking. We present an algorithmic framework, the hierarchical loop decomposition, that allows mapping loopy networks to binary trees, preserving in the connectivity of the trees the architecture of the original graph. We apply this framework to investigate computer generated graphs, such as artificial models and optimal distribution networks, as well as natural graphs extracted from digitized images of dicotyledonous leaves and vasculature of rat cerebral neocortex. We calculate various metrics based on the Asymmetry, the cumulative size distribution and the Strahler bifurcation ratios of the corresponding trees and discuss the relationship of these quantities to the architectural organization of the original graphs. This algorithmic framework decouples the geometric information (exact location of edges and nodes) from the metric topology (connectivity and edge weight) and it ultimately allows us to perform a quantitative statistical comparison between predictions of theoretical models and naturally occurring loopy graphs."

    complexology biophysics network-theory metrics

  • [1110.1393] High-Precision Tuning of State for Memristive Devices by Adaptable Variation-Tolerant Algorithm

    "Using memristive properties common for the titanium dioxide thin film devices, we designed a simple write algorithm to tune device conductance at a specific bias point to 1% relative accuracy (which is roughly equivalent to 7-bit precision) within its dynamic range even in the presence of large variations in switching behavior. The high precision state is nonvolatile and the results are likely to be sustained for nanoscale memristive devices because of the inherent filamentary nature of the resistive switching. The proposed functionality of memristive devices is especially attractive for analog computing with low precision data. As one representative example we demonstrate hybrid circuitry consisting of CMOS summing amplifier and two memristive devices to perform analog multiply and accumulate computation, which is a typical bottleneck operation in information processing."

    memristors engineering-design simulation control-systems nudge-targets

  • [1110.1521] Nodal domains of a non-separable problem – the right angled isosceles triangle

    "Our result may be generalized to other domains where similar algorithms may apply. Our algorithm is based on the fact that the eigenfunctions are presented as a linear combination of simple plane waves. It is therefore tempting to try and generalize it for other drums with similar property. The equilateral triangle is an immediate candidate (see [29] and references within).

    A further, and quite surprising, result is the recursive formula for the number of nodal loops. To our knowledge this is the first known exact formula for the nodal count of a non-separable planar manifold (for certain eigenfunctions of tori exact formulas have been given in [22]). The formula was found by direct inspection of large tables and has been verified for a large bulk of data computationally. An obvious challenge is to prove this formula. In particular, the recursive part of the formula resembles the famous Euclid algorithm for the greatest common divisor. A further investigation of the mentioned formula might therefore expose some new number theoretical properties of the nodal count."

    physics algorithms analytical-results open-questions geometry acoustics exact-form nudge-targets

  • [1110.1485] A Face Recognition Scheme using Wavelet Based Dominant Features

    "In this paper, a multi-resolution feature extraction algorithm for face recognition is proposed based on two-dimensional discrete wavelet transform (2D-DWT), which efficiently exploits the local spatial variations in a face image. For the purpose of feature extraction, instead of considering the entire face image, an entropy-based local band selection criterion is developed, which selects high-informative horizontal segments from the face image. In order to capture the local spatial variations within these highinformative horizontal bands precisely, the horizontal band is segmented into several small spatial modules. Dominant wavelet coefficients corresponding to each local region residing inside those horizontal bands are selected as features. In the selection of the dominant coefficients, a threshold criterion is proposed, which not only drastically reduces the feature dimension but also provides high within-class compactness and high between-class separability. A principal component analysis is performed to further reduce the dimensionality of the feature space. Extensive experimentation is carried out upon standard face databases and a very high degree of recognition accuracy is achieved by the proposed method in comparison to those obtained by some of the existing methods."

    face-recognition algorithms image-processing wavelets nudge-targets

  • [1110.1553] Hierarchical QR factorization algorithms for multi-core cluster systems

    "This paper describes a new QR factorization algorithm which is especially designed for massively parallel platforms combining parallel distributed multi-core nodes. These platforms make the present and the foreseeable future of high-performance computing. Our new QR factorization algorithm falls in the category of the tile algorithms which naturally enables good data locality for the sequential kernels executed by the cores (high sequential performance), low number of messages in a parallel distributed setting (small latency term), and fine granularity (high parallelism)."

    parallel-computing operations-research factorization algorithms nudge-targets meta-algorithms

  • [1110.1560] On the Coloring of Grid Wireless Sensor Networks: the Vector-Based Coloring Method

    "Graph coloring is used in wireless networks to optimize network resources: bandwidth and energy. Nodes access the medium according to their color. It is the responsibility of the coloring algorithm to ensure that interfering nodes do not have the same color. In this research report, we focus on wireless sensor networks with grid topologies. How does a coloring algorithm take advantage of the regularity of grid topology to provide an optimal periodic coloring, that is a coloring with the minimum number of colors? We propose the Vector-Based Coloring Method, denoted VCM, a new method that is able to provide an optimal periodic coloring for any radio transmission range and for any h-hop coloring, h>=1. This method consists in determining at which grid nodes a color can be reproduced without creating interferences between these nodes while minimizing the number of colors used. We compare the number of colors provided by VCM with the number of colors obtained by a distributed coloring algorithm with line and column priority assignments. We also provide bounds on the number of colors of optimal general colorings of the infinite grid, and show that periodic colorings (and thus VCM) are asymptotically optimal. Finally, we discuss the applicability of this method to a real wireless network."

    graph-theory algorithms operations-research nudge-targets

  • [1110.1580] A Polylogarithmic-Competitive Algorithm for the k-Server Problem

    "We give the first polylogarithmic-competitive randomized online algorithm for the $k$-server problem on an arbitrary finite metric space. In particular, our algorithm achieves a competitive ratio of O(log^3 n log^2 k log log n) for any metric space on n points. Our algorithm improves upon the deterministic (2k-1)-competitive algorithm of Koutsoupias and Papadimitriou [J.ACM'95] whenever n is sub-exponential in k."

    scheduling operations-research algorithms nudge-targets

  • [1110.1590] PSA: The Packet Scheduling Algorithm for Wireless Sensor Networks

    "The main cause of wasted energy consumption in wireless sensor networks is packet collision. The packet scheduling algorithm is therefore introduced to solve this problem. Some packet scheduling algorithms can also influence and delay the data transmitting in the real-time wireless sensor networks. This paper presents the packet scheduling algorithm (PSA) in order to reduce the packet congestion in MAC layer leading to reduce the overall of packet collision in the system The PSA is compared with the simple CSMA/CA and other approaches using network topology benchmarks in mathematical method. The performances of our PSA are better than the standard (CSMA/CA). The PSA produces better throughput than other algorithms. On other hand, the average delay of PSA is higher than previous works. However, the PSA utilizes the channel better than all algorithms."

    sensor-networks distributed-processing scheduling routing operations-research algorithms nudge-targets

  • [1110.0725] A Survey of Distributed Data Aggregation Algorithms

    "Distributed data aggregation has been an active field of research in the last decade, and a huge diverse amount of techniques can be found in the literature. For this reasons, this survey intends to be an important time saving instrument, for those that desire to get a quick and comprehensive overview of the state of the art on distributed data aggregation. Moreover, by carefully highlighting the strength and limitations of the more pertinent approaches, this study can provide a useful assistance to help readers choose which technique to apply in specific settings.

    Currently, there is no ideal general solution to the distributed computation of an aggregation function, all existing techniques have its pitfalls (some more than others). Therefore, more research in this field will be expected in the next few years. In particular, due to the added value of computing complex aggregates, new algorithms might arise to estimate the statistical distribution of values, as the few existing approaches exhibit some limitations in terms of accuracy and resource consumption. Additional research efforts should be made to improve the support to churn, message loss, and continuous estimation of mutable input values."

    statistics reviews distributed-processing communication coordination nudge-targets

  • [0911.3482] Complexity of Networks (reprise)

    "Network or graph structures are ubiquitous in the study of complex systems. Often, we are interested in complexity trends of these system as it evolves under some dynamic. An example might be looking at the complexity of a food web as species enter an ecosystem via migration or speciation, and leave via extinction.

    In a previous paper, a complexity measure of networks was proposed based on the {em complexity is information content} paradigm. To apply this paradigm to any object, one must fix two things: a representation language, in which strings of symbols from some alphabet describe, or stand for the objects being considered; and a means of determining when two such descriptions refer to the same object. With these two things set, the information content of an object can be computed in principle from the number of equivalent descriptions describing a particular object.

    The previously proposed representation language had the deficiency that the fully connected and empty networks were the most complex for a given number of nodes. A variation of this measure, called zcomplexity, applied a compression algorithm to the resulting bitstring representation, to solve this problem. Unfortunately, zcomplexity proved too computationally expensive to be practical.
    In this paper, I propose a new representation language that encodes the number of links along with the number of nodes and a representation of the linklist. This, like zcomplexity, exhibits minimal complexity for fully connected and empty networks, but is as tractable as the original measure."

    network-theory complexology complex-systems measurement perform structure-function-relations discrete-mathematics

  • [1108.4279] Detection and emergence

    "Two different conceptions of emergence are reconciled as two instances of the phenomenon of detection. In the process of comparing these two conceptions, we find that the notions of complexity and detection allow us to form a unified definition of emergence that clearly delineates the role of the observer."

    complexology emergence pragmatism-it-ain't but-soon

  • [1001.4278] Weight Optimization for Distributed Average Consensus Algorithm in Symmetric, CCS & KCS Star Networks

    "This paper addresses weight optimization problem in distributed consensus averaging algorithm over networks with symmetric star topology. We have determined optimal weights and convergence rate of the network in terms of its topological parameters. In addition, two alternative topologies with more rapid convergence rates have been introduced. The new topologies are Complete-Cored Symmetric (CCS) star and K-Cored Symmetric (KCS) star topologies. It has been shown that the optimal weights for the edges of central part in symmetric and CCS star configurations are independent of their branches. By simulation optimality of obtained weights under quantization constraints have been verified."

    operations-research decision-making network-theory nudge-targets

  • [1109.5389] Water drives peptide conformational transitions

    "Transitions between metastable conformations of a dipeptide are investigated using classical molecular dynamics simulation with explicit water molecules. The distribution of the surrounding water at different moments before the transitions and the dynamical correlations of water with the peptide's configurational motions indicate that water is the main driving force of the conformational changes."

    molecular-design systems-biology simulation intracellular-dynamics kinda-knew-this-a-long-time-ago biochemistry

  • [1105.1445] Vehicular traffic flow at an intersection with the possibility of turning

    "We have developed a Nagel-Schreckenberg cellular automata model for describing of vehicular traffic flow at a single intersection. A set of traffic lights operating in fixed-time scheme controls the traffic flow. Open boundary condition is applied to the streets each of which conduct a uni-directional flow. Streets are single-lane and cars can turn upon reaching to the intersection with prescribed probabilities. Extensive Monte Carlo simulations are carried out to find the model flow characteristics. In particular, we investigate the flows dependence on the signalisation parameters, turning probabilities and input rates. It is shown that for each set of parameters, there exist a plateau region inside which the total outflow from the intersection remains almost constant. We also compute total waiting time of vehicles per cycle behind red lights for various control parameters."

    cellular-automata complexology traffic-models agent-based simulation nudge-substrates

  • [1110.1391] A Comparison of Different Machine Transliteration Models

    "Machine transliteration is a method for automatically converting words in one language into phonetically equivalent ones in another language. Machine transliteration plays an important role in natural language applications such as information retrieval and machine translation, especially for handling proper nouns and technical terms. Four machine transliteration models — grapheme-based transliteration model, phoneme-based transliteration model, hybrid transliteration model, and correspondence-based transliteration model — have been proposed by several researchers. To date, however, there has been little research on a framework in which multiple transliteration models can operate simultaneously. Furthermore, there has been no comparison of the four models within the same framework and using the same data. We addressed these problems by 1) modeling the four models within the same framework, 2) comparing them under the same conditions, and 3) developing a way to improve machine transliteration through this comparison. Our comparison showed that the hybrid and correspondence-based models were the most effective and that the four models can be used in a complementary manner to improve machine transliteration performance."

    natural-language-processing machine-learning review nudge-targets

  • [1108.5508] A Pattern Measure

    "In this paper we propose numerical measures for evaluating the aesthetic interest of simple patterns. The patterns consist of elements (symbols, pixels, etc.) in regular square arrays. The measures depend on two characteristics of the patterns: the number of different types of element, and the number of symmetries in their arrangement. We define two complementary composite measures L and C for the degree of pattern in a design, and compute them here for 2×2 and 6×6 arrays. The results distinguish simple from high-variation cases. We suspect that the measure L corresponds to the degree that human beings intuitively feel a design to be "interesting", so this model would aid in quantifying the visual connection of two- dimensional designs with viewers. The other composite measure C based on these numerical properties characterizes the extent of randomness of an array. Combining symbol variety with symmetry calculations allows us to employ hierarchical scaling to count the relative impact of different levels of scale. By identifying substructures we can distinguish between organized patterns and disorganized complexity. The measures described here are related to verbal descriptors derived from work by psychologists on responses to visual environments."

    cognition aesthetics experimental-psychology nudge-targets learning-by-watching

  • [1106.5264] Acquiring Correct Knowledge for Natural Language Generation

    "Natural language generation (NLG) systems are computer software systems that produce texts in English and other human languages, often from non-linguistic input data. NLG systems, like most AI systems, need substantial amounts of knowledge. However, our experience in two NLG projects suggests that it is difficult to acquire correct knowledge for NLG systems; indeed, every knowledge acquisition (KA) technique we tried had significant problems. In general terms, these problems were due to the complexity, novelty, and poorly understood nature of the tasks our systems attempted, and were worsened by the fact that people write so differently. This meant in particular that corpus-based KA approaches suffered because it was impossible to assemble a sizable corpus of high-quality consistent manually written texts in our domains; and structured expert-oriented KA techniques suffered because experts disagreed and because we could not get enough information about special and unusual cases to build robust systems. We believe that such problems are likely to affect many other NLG systems as well. In the long term, we hope that new KA techniques may emerge to help NLG system builders. In the shorter term, we believe that understanding how individual KA techniques can fail, and using a mixture of different KA techniques with different strengths and weaknesses, can help developers acquire NLG knowledge that is mostly correct."

    natural-language-processing artificial-intelligence interesting-problems high-hanging-fruit machine-learning nudge-targets

  • [1105.2423] Cytoskeleton and Cell Motility

    "The present article is an invited contribution to the Encyclopedia of Complexity and System Science, Robert A. Meyers Ed., Springer New York (2009). It is a review of the biophysical mechanisms that underly cell motility.…"

    biophysics biology review i-used-to-do-this-stuff lovely

  • [1110.0671] Width Distributions for Convex Regular Polyhedra

    "The mean width is a measure on three-dimensional convex bodies that enjoys equal status with volume and surface area [Rota]. As the phrase suggests, it is the mean of a probability density f. We verify formulas for mean widths of the regular tetrahedron and the cube. Higher-order moments of f_tetra and f_cube have not been examined until now. Assume that each polyhedron has edges of unit length. We deduce that the mean square width of the regular tetrahedron is 1/3+(3+sqrt(3))/(3*pi) and the mean square width of the cube is 1+4/pi."

    geometry mathematical-recreations nudge-targets

  • [cs/0305036] Using Dynamic Simulation in the Development of Construction Machinery

    "As in the car industry for quite some time, dynamic simulation of complete vehicles is being practiced more and more in the development of off-road machinery. However, specific questions arise due not only to company structure and size, but especially to the type of product. Tightly coupled, non-linear subsystems of different domains make prediction and optimisation of the complete system's dynamic behaviour a challenge. Furthermore, the demand for versatile machines leads to sometimes contradictory target requirements and can turn the design process into a hunt for the least painful compromise. This can be avoided by profound system knowledge, assisted by simulation-driven product development. This paper gives an overview of joint research into this issue by Volvo Wheel Loaders and Linkoping University on that matter, lists the results of a related literature review and introduces the term "operateability". Rather than giving detailed answers, the problem space for ongoing and future research is examined and possible solutions are sketched."

    engineering-design design-automation modeling dynamical-systems manufacturing nudge-targets