links for 2010-08-27
-
"So, let’s get down to the nitty gritty. If consumer debt was $13.8 trillion at the end of 2008 and the banks have since written off 5.66% of that debt, total write-offs were $800 billion. If total consumer debt now sits at $13.5 trillion, then consumers have actually taken on $500 billion of additional debt since the end of 2008. The consumer hasn’t cut back at all. They are still spending and borrowing. It is beyond my comprehension that no one on CNBC or in the other mainstream media can do simple math to figure out that the deleveraging story is just a Big Lie."
links for 2010-08-24
-
"I strongly suspect these patterns are driven mostly by customers, i.e., that more accurate professionals would be less successful in inspiring confidence by others in them. If you are a successful professional, that is probably in part because of your unjustified arrogance."
-
"We all have preconceived ideas about what hummingbirds’ lives are like, but so much of their world is imperceptible to the human eye. Filmmaker Ann Prum describes the breakthrough science and latest technologies that allowed her and the crew to reveal incredible new insights about these aerial athletes."
-
"The remaining charts compare market performance since 2000 with the equivalent elapsed time following the peak in 1929. As the final chart shows, the current real total return over the past decade is worse than the performance over the equivalent timeframe during the Great Depression."
-
"Here is the trailer for Inside Job, Charles Ferguson's upcoming documentary about the financial crisis of 2008. Looks like interesting and well-done stuff."
-
"I think every body who’s breathed the air around eco nom ics gets the the sis that money is an eco nomic prod uct sub ject to sup ply and demand like any other. But to actu ally see it bro ken down as analy sis of dis crete things — a fiat cur rency backed by the full faith and credit of the US gov’t but whose weight and mate ri als and cost and dura bil ity and shape all turn out to be cru cial to its suc cess or fail ure — man, it’s another thing altogether. "
links for 2010-08-20
-
"We understand the dynamics of the world around us as by associating pairs of events, where one event has some influence on the other. These pairs of events can be aggregated into a web of memories representing our understanding of an episode of history. The events and the associations between them need not be directly experienced-they can also be acquired by communication. In this paper we take a network approach to study the dynamics of memories of history. First we investigate the network structure of a data set consisting of reported events by several individuals and how associations connect them. We focus our measurement on degree distributions, degree correlations, cycles (which represent inconsistencies as they would break the time ordering) and community structure.…"
-
"he motility of the worm nematode \textit{Caenorhabditis elegans} is investigated in shallow, wet granular media as a function of particle size dispersity and area density ($\phi$). Surprisingly, we find that the nematode's propulsion speed is enhanced by the presence of particles in a fluid and is nearly independent of area density. The undulation speed, often used to differentiate locomotion gaits, is significantly affected by particle size dispersity for area densities above $\phi \geq 0.55$, and is characterized by a change in the nematode's waveform from swimming to crawling in dense polydisperse media \textit{only}. This change highlights the organism's adaptability to subtle differences in local structure between monodisperse and polydisperse media."
-
"A short survey is provided about our recent explorations of the young topic of noise-based logic. After outlining the motivation behind noise-based computation schemes, we present a short summary of our ongoing efforts in the introduction, development and design of several noise-based deterministic multivalued logic schemes and elements. In particular, we describe classical, instantaneous, continuum, spike and random-telegraph-signal based schemes with applications such as circuits that emulate the brain's functioning and string verification via a slow communication channel."
-
"We consider problems of Bayesian inference for a spatial epidemic on a graph, where the final state of the epidemic corresponds to bond percolation, and where only the set or number of finally infected sites is observed. We develop appropriate Markov chain Monte Carlo algorithms, demonstrating their effectiveness, and we study problems of optimal experimental design. In particular, we demonstrate that for lattice-based processes an experiment on a sparsified lattice can yield more information on model parameters than one conducted on a complete lattice. We also prove some probabilistic results about the behaviour of estimators associated with large infected clusters."
-
"At the most fundamental level, computers are an assembly of gates that are used to perform the basic operations required to execute a program. For problems in the probability domain, even the values used in these most basic operations are not constrained to be either a 0 or a 1. Instead, the basic gates must determine the probability that a bit is a 1, or the probability that it is a 0.
Lyric’s gates are designed to model relationships between probabilities natively in the device physics. For this reason, Lyric can perform mathematical operations in the probability domain with just a handful of transistors – creating power and area savings of more than 10X over traditional implementations."
-
"Zipf's law seems to be ubiquitous in human languages and appears to be a universal property of complex communicating systems. Following an early proposal made by Zipf concerning the presence of a tension between the efforts of speaker and hearer in a communication system, we introduce evolution by means of a variational approach to the problem based on Kullback's Minimum Discrimination of Information Principle. Using a formalism fully embedded in the framework of information theory, we demonstrate that Zipf's law is the only expected outcome of an evolving, communicative system under a rigorous definition of the communicative tension described by Zipf."
-
"We engineer an algorithm to solve the approximate dictionary matching problem. Given a list of words $\mathcal{W}$, maximum distance $d$ fixed at preprocessing time and a query word $q$, we would like to retrieve all words from $\mathcal{W}$ that can be transformed into $q$ with $d$ or less edit operations. We present data structures that support fault tolerant queries by generating an index. On top of that, we present a generalization of the method that eases memory consumption and preprocessing time significantly. At the same time, running times of queries are virtually unaffected. We are able to match in lists of hundreds of thousands of words and beyond within microseconds for reasonable distances."
-
"The effects of several nonlinear regularization techniques are discussed in the framework of 3D seismic tomography. Traditional, linear, $\ell_2$ penalties are compared to so-called sparsity promoting $\ell_1$ and $\ell_0$ penalties, and a total variation penalty. Which of these algorithms is judged optimal depends on the specific requirements of the scientific experiment. If the correct reproduction of model amplitudes is important, classical damping towards a smooth model using an $\ell_2$ norm works almost as well as minimizing the total variation but is much more efficient. If gradients (edges of anomalies) should be resolved with a minimum of distortion, we prefer $\ell_1$ damping of Daubechies-4 wavelet coefficients.…"
links for 2010-08-17
-
"Assume that we observe a large number of curves, all of them with identical, although unknown, shape, but with a different random shift. The objective is to estimate the individual time shifts and their distribution. Such an objective appears in several biological applications like neuroscience or ECG signal processing, in which the estimation of the distribution of the elapsed time between repetitive pulses with a possibly low signal-noise ratio, and without a knowledge of the pulse shape is of interest. We suggest an M-estimator leading to a three-stage algorithm: we split our data set in blocks, on which the estimation of the shifts is done by minimizing a cost criterion based on a functional of the periodogram; the estimated shifts are then plugged into a standard density estimator. We show that under mild regularity assumptions the density estimate converges weakly to the true shift distribution. The theory is applied both to simulations and to alignment of real ECG signals.…"
-
"Motivation: Second generation sequencing technology makes it feasible for many researches to obtain enough sequence reads to attempt the de novo assembly of higher eukaryotes (including mammals). De novo assembly not only provides a tool for understanding wide scale biological variation, but within human bio-medicine, it offers a direct way of observing both large scale structural variation and fine scale sequence variation. Unfortunately, improvements in the computational feasibility for de novo assembly have not matched the improvements in the gathering of sequence data. This is for two reasons: the inherent computational complexity of the problem, and the in-practice memory requirements of tools."
-
"Whereas a conventional NAND gate outputs a "1" if neither of its inputs match, the output of a Bayesian NAND gate represents the odds that the two input probabilities match. This makes it possible to perform calculations that use probabilities as their input and output."
links for 2010-08-15
-
"Using frequency distributions of daily closing price sequences of several stock markets, we investigate whether the bias away from an equiprobable sequence distribution, predicted by algorithmic probability, may account for some of the deviation of financial markets from log-normal, and if so for how much of said deviation and over what sequence lengths. Our discussion might constitute a potential starting point for a further investigation of the market as a rule-based system with an 'algorithmic' component, despite its apparent randomness. The use of the theory of algorithmic complexity may supply a set of probing new tools that can be applied to the study of the market price phenomenon. Moreover, the main discussion is cast in terms of assumptions common to areas of economics consistent with an algorithmic view of the market."
-
"To define oscillatory movements of securities market, we put in the non-local extension of Ito- equation for wavelet-images of random processes. It is proposed an algorithm of creation of evolutionary equation and a model of prediction of the most probable price movement path. It is carried out experimental validation of findings."
-
"We show that parametric context-sensitive L-systems with affine geometry interpretation provide a succinct description of some of the most fundamental algorithms of geometric modeling of curves. Examples include the Lane-Riesenfeld algorithm for generating B-splines, the de Casteljau algorithm for generating Bezier curves, and their extensions to rational curves. Our results generalize the previously reported geometric-modeling applications of L-systems, which were limited to subdivision curves."
-
"We revisit the matrix problems sparse null space and matrix sparsification, and show that they are equivalent. We then proceed to seek algorithms for these problems: We prove the hardness of approximation of these problems, and also give a powerful tool to extend algorithms and heuristics for sparse approximation theory to these problems."
-
"We study equilibrium configurations of swarming biological organisms subject to exogenous and pairwise endogenous forces. Beginning with a discrete dynamical model, we derive a variational description of the corresponding continuum population density. Equilibrium solutions are extrema of an energy functional, and satisfy a Fredholm integral equation. We find conditions for the extrema to be local minimizers, global minimizers, and minimizers with respect to infinitesimal Lagrangian displacements of mass. In one spatial dimension, for a variety of exogenous forces, endogenous forces, and domain configurations, we find exact analytical expressions for the equilibria.…"
-
"We survey and show our earlier results about three different ways of fluctuation-enhanced sensing of bio agent, the phage-based method for bacterium detection published earlier; sensing and evaluating the odors of microbes; and spectral and amplitude distribution analysis of noise in light scattering to identify spores based on their diffusion coefficient."
-
"We started by scanning existing social value metrics, such as the ones described in the table “10 Ways to Measure Social Value” on page 41. We found hundreds of competing tools, of which foundations and NGOs generally use one set, governments another, and academics yet another. In addition to discovering this segmentation, our survey suggested two more reasons why so few metrics guide real decisions. First, most metrics assume that value is objective, and therefore discoverable through analysis. Yet as most modern economists now agree, value is not an objective fact. Instead, value emerges from the interaction of supply and demand, and ultimately reflects what people or organizations are willing to pay. Because so few of the tools reflect this, they are inevitably misaligned with an organization’s strategic and operational priorities."
-
"At first, the collaboration struck many scientists as worrisome — they would be giving up ownership of data, and anyone could use it, publish papers, maybe even misinterpret it and publish information that was wrong.
But Alzheimer’s researchers and drug companies realized they had little choice.
“Companies were caught in a prisoner’s dilemma,” said Dr. Jason Karlawish, an Alzheimer’s researcher at the University of Pennsylvania. “They all wanted to move the field forward, but no one wanted to take the risks of doing it.”"
-
"One of the most common discussion points we see around Detroit is comparing it to other cities. Although we believe Detroit stands on its own, it’s natural to try to relate our situation with others.
However, many comparisons are drawn to cities like San Francisco, New York, and Boston – and then we got to thinking."
links for 2010-08-14
-
"Fatal crush conditions occur in crowds with tragic frequency. Event organisers and architects are often criticised for failing to consider the causes and implications of crush, but the reality is that the prediction and mitigation of such conditions offers a significant technical challenge. Full treatment of physical force within crowd simulations is precise but computationally expensive; the more common method of human interpretation of results is computationally "cheap" but subjective and time-consuming. In this paper we propose an alternative method for the analysis of crowd behaviour, which uses information theory to measure crowd disorder. We show how this technique may be easily incorporated into an existing simulation framework, and validate it against an historical event. Our results show that this method offers an effective and efficient route towards automatic detection of crush."
links for 2010-08-12
-
"In this work we introduce a new linear time compression algorithm, called "Re-pair for Trees", which compresses ranked ordered trees using linear straight-line context-free tree grammars. Such grammars generalize straight-line context-free string grammars and allow basic tree operations, like traversal along edges, to be executed without prior decompression. Our algorithm can be considered as a generalization of the "Re-pair" algorithm developed by N. Jesper Larsson and Alistair Moffat in 2000. The latter algorithm is a dictionary-based compression algorithm for strings. We also introduce a succinct coding which is specialized in further compressing the grammars generated by our algorithm. This is accomplished without loosing the ability do directly execute queries on this compressed representation of the input tree.…"
-
"Until I started using Specification Workshops as the name for a collaborative meeting about acceptance tests, it was very hard to convince business users to participate. But a simple change in naming made the problem go away."
-
"Weather data didn’t come to be because of an Open Government Directive. It wasn’t created because of a White House mandate. Government did not release the data and then enterprising people built companies on top of it. It’s more accurate to make the argument that we have a national weather service because of one man’s deep desire to keep his job and to get promoted to colonel in the Army. It could be a vast network of lobbyists to help that man get promoted, or the vast network of lobbyists from shipping companies trying to get access to data already being created. Or it could be that it was just pretty obvious that access to weather data would save lives."
-
"While company filings and regulations may not be the most glamorous parts of your startup, they're absolutely critical to the success of your business and safety of your personal savings. Here's a quick rundown of the laws and regulations you need to consider when creating a startup. Of course, depending on your type of business, hiring a tax accountant or good attorney with specific experience in your industry can go a long way to helping you steer clear of trouble."
-
"We study, using simulated experiments inspired by thin film magnetic domain patterns, the feasibility of phase retrieval in X-ray diffractive imaging in the presence of intrinsic charge scattering given only photon-shot-noise limited diffraction data. We detail a reconstruction algorithm to recover the sample's magnetization distribution under such conditions, and compare its performance with that of Fourier transform holography. Concerning the design of future experiments, we also chart out the reconstruction limits of diffractive imaging when photon- shot-noise and the intensity of charge scattering noise are independently varied. This work is directly relevant to the time-resolved imaging of magnetic dynamics using coherent and ultrafast radiation from X-ray free electron lasers and also to broader classes of diffractive imaging experiments which suffer noisy data, missing data or both."
-
"We have shown that there exists a large ensemble of minimal Boolean networks that show reliable and robust dynamics. The networks are minimal in the respect that the number of connections of a node is not larger than necessary for obtaining a desired reliable trajectory. A reliable trajectory is an attractor of the dynamics of the network that does not change when the update schedule is changed or randomized. This means that under parallel update, at each time step only one node changes its state. The reliable trajectories were chosen at random, given a fixed average number of flips per node. High robustness was achieved by using an evolutionary algorithm that modifies the update functions and that accepts only those changes that do not decrease robustness.…"
-
"Our 2.546-approximation is quite simple. The performance guarantee is based on a simple area argument. This gives rise to the following question: what is the smallest square that suffices for packing any set of circles of total area 1? We believe the worst-case may very well be shown in Figure 13, which yields a lower bound of 1.471299… We believe there are relatively easy ways to improve the upper bound."
-
"In this paper we present a technique for fusion of optical and thermal face images based on image pixel fusion approach. Out of several factors, which affect face recognition performance in case of visual images, illumination changes are a significant factor that needs to be addressed. Thermal images are better in handling illumination conditions but not very consistent in capturing texture details of the faces. Other factors like sunglasses, beard, moustache etc also play active role in adding complicacies to the recognition process. Fusion of thermal and visual images is a solution to overcome the drawbacks present in the individual thermal and visual face images.…"
-
"OpenCV (Open Source Computer Vision) is a library of programming functions for real time computer vision.
OpenCV is released under a BSD license, it is free for both academic and commercial use.
The library has >500 optimized algorithms (see figure below). It is used around the world, has >2M downloads and >40K people in the user group. Uses range from interactive art, to mine inspection, stitching maps on the web on through advanced robotics."
-
"We study the dynamics of the Naming Game [Baronchelli et al., (2006) J. Stat. Mech.: Theory Exp. P06014] in empirical social networks. This stylized agent-based model captures essential features of agreement dynamics in a network of autonomous agents, corresponding to the development of shared classification schemes in a network of artificial agents or opinion spreading and social dynamics in social networks. Our study focuses on the impact that communities in the underlying social graphs have on the outcome of the agreement process. We find that networks with strong community structure hinder the system from reaching global agreement; the evolution of the Naming Game in these networks maintains clusters of coexisting opinions indefinitely. Further, we investigate agent-based network strategies to facilitate convergence to global consensus."
-
"Neuronal activity arises from an interaction between ongoing firing generated spontaneously by neural circuits and responses driven by external stimuli. Using mean-field analysis, we ask how a neural network that intrinsically generates chaotic patterns of activity can remain sensitive to extrinsic input. We find that inputs not only drive network responses, they also actively suppress ongoing activity, ultimately leading to a phase transition in which chaos is completely eliminated. The critical input intensity at the phase transition is a non-monotonic function of stimulus frequency, revealing a "resonant" frequency at which the input is most effective at suppressing chaos even though the power spectrum of the spontaneous activity peaks at zero and falls exponentially. A prediction of our analysis is that the variance of neural responses should be most strongly suppressed at frequencies matching the range over which many sensory systems operate."
-
"Developing large-scale distributed applications can be a daunting task. object-based environments have attempted to alleviate problems by providing distributed objects that look like local objects. We advocate that this approach has actually only made matters worse, as the developer needs to be aware of many intricate internal details in order to adequately handle partial failures. The result is an increase of application complexity. We present an alternative in which distribution transparency is lessened in favor of clearer semantics. In particular, we argue that a developer should always be offered the unambiguous semantics of local objects, and that distribution comes from copying those objects to where they are needed. We claim that it is often sufficient to provide only small, immutable objects, along with facilities to group objects into clusters."
-
"The task of image restration is to find the spatial correspondence of two or more given images. In this paper we assume that the correspondence is given either by an Euclidean, or by an affine volume-preserving transformation. Since the registration problem can be seen as an optimization problem on a finite dimensional Lie group, we use a recently developed framework of approximate-Newton methods on manifolds, which leads to locally quadratically convergent algorithms. To reduce numerical costs, we present two strategies: One makes use of the quasi Monte Carlo Method and the other ends up with an algorithm acting on spline function spaces. An extension for multi-modal image registration is given as well."
-
"l1-minimization refers to finding the minimum l1-norm solution to an underdetermined linear system b=Ax. It has recently received much attention, mainly motivated by the new compressive sensing theory that shows that under quite general conditions the minimum l1-norm solution is also the sparsest solution to the system of linear equations. Although the underlying problem is a linear program, conventional algorithms such as interior-point methods suffer from poor scalability for large-scale real world problems. A number of accelerated algorithms have been recently proposed that take advantage of the special structure of the l1-minimization problem. In this paper, we provide a comprehensive review of five representative approaches, namely, Gradient Projection, Homotopy, Iterative Shrinkage-Thresholding, Proximal Gradient, and Augmented Lagrange Multiplier. …"
-
"Sparse data models, where data is assumed to be well represented as a linear combination of a few elements from a dictionary, have gained considerable attention in recent years, and their use has led to state-of-the-art results in many signal and image processing tasks. It is now well understood that the choice of the sparsity regularization term is critical in the success of such models. Based on a codelength minimization interpretation of sparse coding, and using tools from universal coding theory, we propose a framework for designing sparsity regularization terms which have theoretical and practical advantages when compared to the more standard l0 or l1 ones. The presentation of the framework and theoretical foundations is complemented with examples that show its practical advantages in image denoising, zooming and classification."
-
"In the present article we emphasize the importance of modeling time in the context of agent-based models. To this end, we present a (selective) survey of the Cellular Automata-literature on updating and draw parallels to the issue of agent activation in agent-based models. By means of two simple models, Schelling's segregation model and Epstein's demographic prisoner's dilemma we investigate the influence of choosing different regimes of agent activation. Our experiments indicate that timing is not a critical issue for very simple models but bears huge influence on model behavior and results as soon as the degree of complexity increases only so slightly. After a brief review of the way commonly used ABM simulation environments handle the issue of timing, we draw some tentative conclusions about the importance of timing and the need for more research towards that direction, similar to the concerted effort on updating in cellular automata."
-
"Transient algebra is a multi-valued algebra for hazard detection in gate circuits. Sequences of alternating 0's and 1's, called transients, represent signal values, and gates are modeled by extensions of boolean functions to transients. Formulas for computing the output transient of a gate from the input transients are known for NOT, AND, OR} and XOR gates and their complements, but, in general, even the problem of deciding whether the length of the output transient exceeds a given bound is NP-complete. We propose a method of evaluating extensions of general boolean functions. We introduce and study a class of functions with the following property: Instead of evaluating an extension of a boolean function on a given set of transients, it is possible to get the same value by using transients derived from the given ones, but having length at most 3. We prove that all functions of three variables, as well as certain other functions, have this property, and can be efficiently evaluated."
-
"We define a two-step learner for RFSAs based on an observation table by using an algorithm for minimal DFAs to build a table for the reversal of the language in question and showing that we can derive the minimal RFSA from it after some simple modifications. We compare the algorithm to two other table-based ones of which one (by Bollig et al. 2009) infers a RFSA directly, and the other is another two-step learner proposed by the author. We focus on the criterion of query complexity."
-
"…The model is based on the emergent properties of generic genetic networks, it does not refer to specific control circuits and it can therefore hold for a wide class of lineages. The model points to a peculiar role of cellular noise in differentiation, which has never been hypothesized so far, and leads to non trivial predictions which could be subject to experimental testing."
-
"This paper presents a novel type-2 Fuzzy logic System to define the Shape of a facial component with the crisp output. This work is the part of our main research effort to design a system (called FASY) which offers a novel face construction approach based on the textual description and also extracts and analyzes the facial components from a face image by an efficient technique. The Fuzzy model, designed in this paper, takes crisp value of width and height of a facial component and produces the crisp value of Shape for different facial components. This method is designed using Matlab 6.5 and Visual Basic 6.0 and tested with the facial components extracted from 200 male and female face images of different ages from different face databases."
-
"We consider a generalization of the Gabriel graph, the witness Gabriel graph. Given a set of vertices P and a set of witnesses W in the plane, there is an edge ab between two points of P in the witness Gabriel graph GG-(P,W) if and only if the closed disk with diameter ab does not contain any witness point (besides possibly a and/or b). We study several properties of the witness Gabriel graph, both as a proximity graph and as a new tool in graph drawing."
-
"We use computer simulation to study crystal-forming model proteins equipped with interactions that are both orientationally specific and nonspecific. Distinct dynamical pathways of crystal formation can be selected by tuning the strengths of these interactions. When the nonspecific interaction is strong, liquidlike clustering can precede crystallization; when it is weak, growth can proceed via ordered nuclei. Crystal yields are in certain parameter regimes enhanced by the nonspecific interaction, even though it promotes association without local crystalline order. Our results suggest that equipping nanoscale components with weak nonspecific interactions (such as depletion attractions) can alter both their dynamical pathway of assembly and optimize the yield of the resulting material."
-
"Many complex systems present an intrinsic bipartite nature and are often described and modeled in terms of networks [1-5]. Examples include movies and actors [1, 2, 4], authors and scientific papers [6-9], email accounts and emails [10], plants and animals that pollinate them [11, 12]. Bipartite networks are often very heterogeneous in the number of relationships that the elements of one set establish with the elements of the other set. … Here we introduce an unsupervised method to statistically validate each link of the projected network against a null hypothesis taking into account the heterogeneity of the system. We apply our method to three different systems…. In all these systems, both different in size and level of heterogeneity, we find that our method is able to detect network structures which are informative about the system…"
-
"Work out which bits of the system you know least about. Create the scenarios and have conversations around those bits of the system. You don’t have to grow the system from the beginning – you can pick any point you like! Which bits of the system make you most uncomfortable? Which bits make your stakeholders most uncomfortable?"
-
"So where does this gulf of experiences come from, why is cucumber loved by some and hated by others. At the risk of over-generalisation and mischaracterisation I recently came up with a theory: the cucumber detractors are not using cuke the way it was intended."
links for 2010-08-11
-
"We tried a variant of this program starting in 2002 with a more solid economy and we are still trying to recover from how that movie ended. Einstein defined insanity as doing the same thing over and over again and expecting different results. And since the financial sector profited so handsomely from this exercise the last time around, they have every reason to encourage this insanity."
-
"In this work we investigate a novel approach to handle the challenges of face recognition, which includes rotation, scale, occlusion, illumination etc. Here, we have used thermal face images as those are capable to minimize the affect of illumination changes and occlusion due to moustache, beards, adornments etc. The proposed approach registers the training and testing thermal face images in polar coordinate, which is capable to handle complicacies introduced by scaling and rotation. Line features are extracted from thermal polar images and feature vectors are constructed using these line. Feature vectors thus obtained passes through principal component analysis (PCA) for the dimensionality reduction of feature vectors.…"
-
"In 1961 Herbert Simon and Albert Ando published the theory behind the long-term behavior of a dynamical system that can be described by a nearly completely decomposable matrix. Over the past fifty years this theory has been used in a variety of contexts, including queueing theory, computer performance, and ecology. In all these applications, the structure of the system is known and the point of interest is the various states the system passes through on its way to some long-term equilibrium. This paper looks at this problem from the other direction. That is, we develop a technique for using the evolution of the system to tell us about its initial structure, and we use this technique to develop a new algorithm for data clustering."
-
"Smoothie Charts is a really small chartling library designed for live streaming data. I built it to reduce the headaches I was getting from watching charts jerkily updating every second. What you're looking up now is pretty much all it does. If you like that, then read on."
links for 2010-08-10
-
"We discuss the complex dynamics of a non-linear random networks model, as a function of the connectivity k between the elements of the network. We show that this class of networks exhibit an order-chaos phase transition for a critical connectivity k = 2. Also, we show that both, pairwise correlation and complexity measures are maximized in dynamically critical networks. These results are in good agreement with the previously reported studies on random Boolean networks and random threshold networks, and show once again that critical networks provide an optimal coordination of diverse behavior."
-
"The new algorithm by Yadava and his colleagues goes one step further. It uses a more realistic physics model of the x-ray source, the detectors, and the x-ray beam. Each of these three is assumed to have specific diameters instead of being considered a point or a line, Yadava says. Depending on the type of scan, the technique is better than ASIR at cutting image noise, and thus the x-rays can be even less intense. The researchers got high-quality abdomen scans of a human model using an eighth of the radiation dose of a conventional scan."
-
"Because sharing data resulted in a citation, I wonder how long will it take for Open Data advocates to start using this “open data citation advantage” as an argument for sharing data?"
-
"As we continue to monitor the markets for evidence of Quote Stuffing and Strange Sequences (Crop Circles), we find that there are dozens if not hundreds of examples to choose from on any given day. As such, this page will be updated often with charts demonstrating this activity.
The common theme with the charts shown on this page is they are obviously all generated in code and are algorithmic. Some demonstrate bizarre price or size cycling, some demonstrate large bursts of quotes in extremely short time frames and some will demonstrate both. In most cases these sequences are from a single exchange with no other exchange quoting in the same time frame."
-
"While analyzing HFT (High Frequency Trading) quote counts, we were shocked to find cases where one exchange was sending an extremely high number of quotes for one stock in a single second: as high as 5,000 quotes in 1 second! During May 6, there were hundreds of times that a single stock had over 1,000 quotes from one exchange in a single second. Even more disturbing, there doesn't seem to be any economic justification for this. In many of the cases, the bid/offer is well outside the National Best Bid/Offer (NBBO). We decided to analyze a handful of these cases in detail and graphed the sequential bid/offers to better understand them. What we discovered was a manipulative device with destabilizing effect."
-
"…We tested our algorithm on a small benchmark graph and on a network of about 500 papers in information science (weighted with the Salton index of bibliographic coupling). In our tests, this approach results in characteristic ranges of resolution where a large resolution change does not lead to a growth of the natural community. Such stable modules were also obtained by applying the LFK algorithm but since we determine communities for all resolution values in one run, our approach is faster than the LFK reference. And our algorithm reveals the hierarchical structure of the graph more easily."
links for 2010-08-04
-
"We introduce a simple criterion, the CAR score, for ranking and selecting variables in linear regression. The CAR score arises naturally in the best predictor formulation of the linear model, offers a canonical decomposition of the proportion of explained variance, and also takes account of correlation and grouping structure among explanatory variables. As population quantity the CAR score is not tied to any specific inference paradigm. Variable selection based on AIC, $C_p$, BIC, and other information criteria is shown to be equivalent to thresholding CAR scores at a fixed level, whereas using false discovery rates corresponds to an adaptive cutoff. In computer simulations we show that CAR scores are highly effective for variable selection with a prediction error that compares favorable with the elastic net and similar regression procedures. We illustrate the approach by analyzing diabetes data as well as gene expression data from the human frontal cortex."
-
"A parameterisation of generalised network clustering, in the form of four-motif prevalences, is presented. This involves three real parameters that are conditional on one- two- and three-motif prevalences. Interpretations of these real parameters are presented that motivate a set of rewiring schemes to create appropriately clustered networks. Finally, the dynamical implications of higher order structure, as parameterised, for a contact process are considered."
-
"We have developed a framework to study the struc- ture and function of complex networks in purely geomet- ric terms. In this framework, two common properties of complex network topologies, strong heterogeneity and clustering, turn out to be simple reflections of the basic properties of an underlying hyperbolic geometry. Heterogeneity, measured in terms of the power-law degree distribution exponent, is a function of the negative curvature of the hyperbolic space, while clustering reflects its metric property."
-
"We describe the structure of the graphs with the smallest average distance and the largest average clustering given their order and size. There is usually a unique graph with the largest average clustering, which at the same time has the smallest possible average distance. In contrast, there are many graphs with the same minimum average distance, ignoring their average clustering. The form of these graphs is shown with analytical arguments. Finally, we measure the sensitivity to rewiring of this architecture with respect to the clustering coefficient, and we devise a method to make these networks more robust with respect to vertex removal."
-
"In this paper we have proposed and studied a simple model of contribution games, in which agents can invest a fixed budget into different relationships. Our results show that collaboration between pairs of players can lead to instabilities and non-existence of pairwise equilibria. For certain classes of functions, the existence of pairwise equilibria is even NP-hard to decide. This implies that it is impossible to decide efficiently if a set of players in a game can reach a pairwise equilibrium. For many interesting classes of games, however, we are able to show existence and bound the price of anarchy to 2. This includes, for instance, a class of games with general convex functions, or minimum effort games with concave functions. Here we are also able to show that best response dynamics converge to pairwise equilibria."
-
"On a more philosophical level, our approach points at novel questions that go beyond supervised and semi-supervised learning. What benefit do labels provide over unsupervised training? Can our framework be extended to semi-supervised learning where a few labels do exist? Can it be extended to non-classification scenarios such as margin based regression or margin based structured prediction? When are the assumptions likely to hold and how can we make our framework even more resistant to deviations from them? These questions and others form new and exciting open research directions."
-
"High-dimensional correlated data pose challenges in model selection and predictive learning. In this paper, we derive an iterative thresholding technique for generalized linear models (GLMs) with possibly nonorthogonal designs. We propose a family of $\Theta$-estimators which are associated with penalized likelihoods and can be computed by thresholding-based iterative procedures. It can also be used to robustify GLMs and extend the canonical $M$-estimators.…"
-
"The problem of arriving at a principled method of pricing goods and services was very satisfactorily solved for conventional goods; however, this solution is not applicable to digital goods. After taking into consideration idiosyncrasies of the digital realm, we give a market model that is appropriate for the digital setting, and a notion of equilibrium for it. We also prove existence of equilibrium for our market model."
-
"Nonlinear bilateral filters (BF) deliver a fine blend of computational simplicity and blur-free denoising. However, little is known about their nature, noise-suppressing properties, and optimal choices of filter parameters. Our study is meant to fill this gap-explaining the underlying mechanism of bilateral filtering and providing the methodology for optimal filter selection. Practical application to CT image denoising is discussed to illustrate our results."

