These are my recent Pinboard.in links:
Monthly Archives: December 2011
Items of some interest…
These are my recent Pinboard.in links:
-
[1108.4135] Complex-Valued Autoencoders
"Autoencoders are unsupervised machine learning circuits whose learning goal is to minimize a distortion measure between inputs and outputs. Linear autoencoders can be defined over any field and only real-valued linear autoencoder have been studied so far. Here we study complex-valued linear autoencoders where the components of the training vectors and adjustable matrices are defined over the complex field with the $L_2$ norm. We provide simpler and more general proofs that unify the real-valued and complex-valued cases, showing that in both cases the landscape of the error function is invariant under certain groups of transformations. The landscape has no local minima, a family of global minima associated with Principal Component Analysis, and many families of saddle points associated with orthogonal projections onto sub-space spanned by sub-optimal subsets of eigenvectors of the covariance matrix. The theory yields several iterative, convergent, learning algorithms, a clear understanding of the generalization properties of the trained autoencoders, and can equally be applied to the hetero-associative case when external targets are provided. Partial results on deep architecture as well as the differential geometry of autoencoders are also presented. The general framework described here is useful to classify autoencoders and identify general common properties that ought to be investigated for each class, illuminating some of the connections between information theory, unsupervised learning, clustering, Hebbian learning, and auto encoders."
neural-networks machine-learning classification encoding algorithms nudge-targets
-
[1108.5685] Predicting flow reversals in chaotic natural convection using data assimilation
"A simplified model of natural convection, similar to the Lorenz (1963) system, is compared to computational fluid dynamics simulations in order to test data assimilation methods and better understand the dynamics of convection. The thermosyphon is represented by a long time flow simulation, which serves as a reference "truth". Forecasts are then made using the Lorenz-like model and synchronized to noisy and limited observations of the truth using data assimilation. The resulting analysis is observed to infer dynamics absent from the model when using short assimilation windows.
Furthermore, chaotic flow reversal occurrence and residency times in each rotational state are forecast using analysis data. Flow reversals have been successfully forecast in the related Lorenz system, as part of a perfect model experiment, but never in the presence of significant model error or unobserved variables. Finally, we provide new details concerning the fluid dynamical processes present in the thermosyphon during these flow reversals."
chaos dynamical-systems experiment prediction numerical-methods algorithms nudge-targets
-
[1108.1320] Compressed Matrix Multiplication
"Motivated by the problems of computing sample covariance matrices, and of transforming a collection of vectors to a basis where they are sparse, we present a simple algorithm that computes an approximation of the product of two n-by-n real matrices A and B.…"
-
[1110.5296] Computing a Longest Common Palindromic Subsequence
"The {em longest common subsequence (LCS)} problem is a classic and well-studied problem in computer science. Palindrome is a word which reads the same forward as it does backward. The {em longest common palindromic subsequence (LCPS)} problem is an interesting variant of the classic LCS problem which finds the longest common subsequence between two given strings such that the computed subsequence is also a palindrome. In this paper, we study the LCPS problem and give efficient algorithms to solve this problem. To the best of our knowledge, this is the first attempt to study and solve this interesting problem."
Items of some interest…
These are my recent Pinboard.in links:
-
[1109.5664] Deterministic Feature Selection for $k$-means Clustering
"We study feature selection for $k$-means clustering. Although the literature contains many methods with good empirical performance, algorithms with provable theoretical behavior have only recently been developed. Unfortunately, these algorithms are randomized and fail with, say, a constant probability. We address this issue by presenting a emph{deterministic} feature selection algorithm for $k$-means with theoretical guarantees. At the heart of our algorithm lies a deterministic method for decompositions of the identity."
-
[1110.5190] Constant-factor approximation of domination number in sparse graphs
"The k-domination number of a graph is the minimum size of a set X such that every vertex of G is in distance at most k from X. We give a linear time constant-factor approximation algorithm for k-domination number in classes of graphs with bounded expansion, which include e.g. proper minor-closed graph classes, classes closed on topological minors or classes of graphs that can be drawn on a fixed surface with bounded number of crossings on each edge.
The algorithm is based on the following approximate min-max characterization. A subset A of vertices of a graph G is d-independent if the distance between each pair of vertices in A is greater than d. Note that the size of the largest 2k-independent set is a lower bound for the k-domination number. We show that every graph from a fixed class with bounded expansion contains a 2k-independent set A and a k-dominating set D such that |D|=O(|A|), and these sets can be found in linear time. For domination number (k=1) the assumptions can be relaxed, and the result holds for all graph classes with arrangeability bounded by a constant."
operations-research combinatorics graph-theory algorithms nudge-targets
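The lower bound mentioned in the abstract (any 2k-independent set is no larger than any k-dominating set) can be checked on a toy graph. The sketch below is only an illustration in Python: the greedy and exhaustive-search helpers are hypothetical names of mine, not the paper's linear-time algorithm, and the 7-vertex path graph is an arbitrary example.

```python
from collections import deque
from itertools import combinations


def ball(adj, source, radius):
    """Vertices within `radius` hops of `source` (breadth-first search)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if dist[u] == radius:
            continue
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return set(dist)


def greedy_independent(adj, d):
    """Greedily pick vertices whose pairwise distances all exceed d."""
    chosen, blocked = [], set()
    for v in sorted(adj):
        if v not in blocked:
            chosen.append(v)
            blocked |= ball(adj, v, d)
    return chosen


def min_dominating(adj, k):
    """Smallest k-dominating set by exhaustive search (tiny graphs only)."""
    vertices = sorted(adj)
    balls = {v: ball(adj, v, k) for v in vertices}
    for size in range(1, len(vertices) + 1):
        for cand in combinations(vertices, size):
            if set().union(*(balls[v] for v in cand)) == set(vertices):
                return list(cand)
    return []


# Path graph on 7 vertices: 0 - 1 - 2 - 3 - 4 - 5 - 6
path = {i: [j for j in (i - 1, i + 1) if 0 <= j < 7] for i in range(7)}
```

With `k = 1`, `greedy_independent(path, 2)` and `min_dominating(path, 1)` both have three vertices, so the bound is tight on this example.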
-
[1112.1945] Approximation Algorithms for Edge Partitioned Vertex Cover Problems
"In the Partial Vertex Cover (PVC) problem we are given an undirected graph G = (V, E), a positive cost associated with each vertex and a positive integer k and the goal is to find a minimum cost subset of vertices S such that atleast k edges of the graph are covered. In this paper we consider two new generalization of the PVC problem. In the first variation which we call Partition Vertex Cover (Partition-VC) problem, the edges of the graph G are divided into n disjoint partitions $P_1, P_2… P_n$ and we have to select a minimum cost subset of vertices S such that atleast $k_i$ edges are covered from partition $P_i$. In the second variation which we call Knapsack Partition Vertex Cover (KPVC) problem, in addition to the previous conditions, each edge e has a profit $pi_{e}$ associated with it and we have an added knapsack constraint that the total profit of the covered edges in partition $P_i$ should be atleast $Pi_i$. We give an $O(log n)$ approximation for both the problems using a combination of deterministic rounding and randomized rounding approach that operates on the LP strengthened by adding Knapsack Cover inequalities as proposed by Carr, Fleischer, Leung & Phillips. We also show that these bounds can not be further improved by reducing the set cover problem to the Partition-VC problem in polynomial time. We also give an $O(f)$ approximation for the Partition-VC problem using a primal dual schema where f is the maximum number of edges in any partition."
operations-research graph-theory graph-partitioning linear-programming nudge-targets
-
[1101.3501] Convergence rates of efficient global optimization algorithms
"Efficient global optimization is the problem of minimizing an unknown function f, using as few evaluations f(x) as possible. It can be considered as a continuum-armed bandit problem, with noiseless data and simple regret. Expected improvement is perhaps the most popular method for solving this problem; the algorithm performs well in experiments, but little is known about its theoretical properties. Implementing expected improvement requires a choice of Gaussian process prior, which determines an associated space of functions, its reproducing-kernel Hilbert space (RKHS). When the prior is fixed, expected improvement is known to converge on the minimum of any function in the RKHS. We begin by providing convergence rates for this procedure. The rates are optimal for functions of low smoothness, and we modify the algorithm to attain optimal rates for smoother functions. For practitioners, however, these results are somewhat misleading. Priors are typically not held fixed, but depend on parameters estimated from the data. For standard estimators, we show this procedure may never discover the minimum of f. We then propose alternative estimators, chosen to minimize the constants in the rate of convergence, and show these estimators retain the convergence rates of a fixed prior."
optimization operations-research theory-and-practice-sitting-in-a-tree nudge-targets algorithms
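The abstract does not spell out the expected improvement criterion, so here is a small Python sketch of the standard closed form for minimization, assuming the Gaussian process posterior at a candidate point is normal with mean `mu` and standard deviation `sigma`. The function and parameter names are my own, and fitting the Gaussian process itself is left out.

```python
import math


def expected_improvement(mu: float, sigma: float, best: float) -> float:
    """Expected improvement for minimization under a N(mu, sigma^2) posterior.

    EI = E[max(best - F, 0)] with F ~ N(mu, sigma^2), where `best` is the
    smallest value of f observed so far.
    """
    if sigma <= 0.0:
        # Degenerate posterior: the prediction is a point mass at mu.
        return max(best - mu, 0.0)
    z = (best - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)  # standard normal density
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))  # standard normal CDF
    return (best - mu) * cdf + sigma * pdf
```

The next evaluation point would be whichever candidate maximizes this quantity. The first term rewards a low predicted mean and the second rewards uncertainty.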
-
[1011.1939] Discrete Partitioning and Coverage Control for Gossiping Robots
"We propose distributed algorithms to automatically deploy a team of mobile robots to partition and provide coverage of a non-convex environment. To handle arbitrary non-convex environments, we represent them as graphs. Our partitioning and coverage algorithm requires only short-range, unreliable pairwise "gossip" communication. The algorithm has two components: (1) a motion protocol to ensure that neighboring robots communicate at least sporadically, and (2) a pairwise partitioning rule to update territory ownership when two robots communicate. By studying an appropriate dynamical system on the space of partitions of the graph vertices, we prove that territory ownership converges to a pairwise-optimal partition in finite time. This new equilibrium set represents improved performance over common Lloyd-type algorithms. Additionally, we detail how our algorithm scales well for large teams in large environments and how the computation can run in anytime with limited resources. Finally, we report on large-scale simulations in complex environments and hardware experiments using the Player/Stage robot control system."
complexology robotics agent-based computational-geometry nudge-targets voronoi emergent-design
-
[1112.1841] Consistency of multidimensional combinatorial substitutions
"Multidimensional combinatorial substitutions are rules that replace symbols by finite patterns of symbols in Z^d. We focus on the case where the patterns are not necessarily rectangular, which requires a specific description of the way they are glued together in the image by a substitution. Two problems can arise when defining a substitution in such a way: it can fail to be consistent, and the patterns in an image by the substitution might overlap.
We prove that it is undecidable whether a two-dimensional substitution is consistent or overlapping, and we provide practical algorithms to decide these properties in some particular cases."
fractals rewriting-systems mathematical-recreations amusing nudge-targets undecodability
-
"Given a set $P$ of $n$ points in the plane, we solve the problems of constructing a geometric planar graph spanning $P$ 1) of minimum degree 2, and 2) which is 2-edge connected, respectively, and has max edge length bounded by a factor of 2 times the optimal; we also show that the factor 2 is best possible given appropriate connectivity conditions on the set $P$, respectively. First, we construct in $O(nlog{n})$ time a geometric planar graph of minimum degree 2 and max edge length bounded by 2 times the optimal. This is then used to construct in $O(nlog n)$ time a 2-edge connected geometric planar graph spanning $P$ with max edge length bounded by $sqrt{5}$ times the optimal, assuming that the set $P$ forms a connected Unit Disk Graph. Second, we prove that 2 times the optimal is always sufficient if the set of points forms a 2 edge connected Unit Disk Graph and give an algorithm that runs in $O(n^2)$ time. We also show that for $k in O(sqrt{n})$, there exists a set $P$ of $n$ points in the plane such that even though the Unit Disk Graph spanning $P$ is $k$-vertex connected, there is no 2-edge connected geometric planar graph spanning $P$ even if the length of its edges is allowed to be up to 17/16."
graph-theory geometry algorithms computational-geometry nudge-targets
Items of some interest…
These are my recent Pinboard.in links:
-
BOOKTRYST: Thereby Hangs a Quote, and a New, Must-Read Book on Books
"Trade secrets of medieval book illuminators, the private press movement and Barker's welcome apostasy ("Who the hell reads Kelmscott Press books?"), the degradation of paper quality, the improvement in ink, bookshop merchandizing, the importance of visual detail and symbolism and how the ability to read images has decayed, the importance of the shape of letters as a map of the human mind, Congolese bards, calligraphy, copperplate engraving and the personality of the engraver, Victorian typography, Goudy, Gill, Dwiggins, Morison, the importance of curve, and the current state of "Jine" printing. "
printing typography miscellanies book-review book-culture to-read