“The missionary zeal of many Bayesians of old has been matched, in the other direction, by a view among some theoreticians that Bayesian methods are absurd: not merely misguided but obviously wrong in principle. We consider several examples, beginning with Feller’s classic text on probability theory and continuing with more recent cases such as the perceived Bayesian nature of the so-called doomsday argument. We analyze in this note the intellectual background behind various misconceptions about Bayesian statistics, without aiming at a complete historical coverage of the reasons for this dismissal.”
“In this paper, we considered the problem of finding a subset of covariates in a high-dimensional space that affect the output variable when there is a block structure in the covariates. In the context of association mapping, we proposed a regression-based model with a Markov chain prior that encodes the information in the correlation structure such as distance and recombination rate between adjacent SNP markers. We demonstrated on the simulated and mouse data that our proposed algorithm can be used to identify groups of SNP markers as a relevant block of causal SNPs.
The idea of representing the correlation structure as a Markov chain in a variable selection method to learn grouped relevant variables can be generalized to use a graphical model as a prior in a variable selection problem to represent an arbitrary correlation structure in variables in a high-dimensional space. Another interesting extension of the model is to model a structure in output variables as well when measurements of multiple output variables are available.”
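The abstract describes the prior only at this level. For intuition, one plausible concrete form (my notation, assuming binary inclusion indicators; not taken from the paper) chains the indicators of adjacent markers so that the probability of staying in the same state depends on inter-marker distance or recombination rate:

```latex
% Hypothetical Markov chain prior on inclusion indicators
% \gamma_j \in \{0,1\} for SNP markers j = 1, \dots, J:
P(\gamma) = P(\gamma_1) \prod_{j=2}^{J} P(\gamma_j \mid \gamma_{j-1}),
\qquad
P(\gamma_j = \gamma_{j-1} \mid \gamma_{j-1}) = f(d_j)
```

where d_j is the distance (or recombination rate) between markers j−1 and j and f is decreasing, so nearby SNPs tend to be selected or excluded together as a block.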
“The paper goes into lots more detail, but the lesson for researchers is extremely simple: don’t cross the streams of data-analysis. Set up your analysis stream and then use it on all of your data. Same hardware, same software, same settings.
Imagine you’re doing a study comparing brain structure in two groups. Halfway through analyzing your data, you upgrade your MacOS. All of the brains you analyze after that will be, say, 5% “bigger”. That’ll certainly make your data much noisier, and if you happen to analyze most of Group A before Group B, it’ll give you a false positive finding.
Sometimes you just can’t avoid changes in hardware or software — IT techs have a habit of upgrading things without asking — but in these cases, you should run the same data under the old and the new regime to see if it’s making a difference.
Finally, it would be wrong to blame FreeSurfer for this. I’d be surprised if they were any worse than the other software packages. Mixing and matching versions is something that the FreeSurfer developers specifically warn against. This paper shows why.”
“I’ve been critical of objects and the idea of reference for a while now. To me sentences and propositions, by virtue of their role as “moves” in social interactions, are likely to have priority in a properly objective account of meaning. Many putative objects (e.g. corporations or mutable digital documents) border on being fictional, gaining their objecthood only through what we say about them; and many referring phrases seem to refer to different things, depending on what is being predicated. I think this opinion would make me what Peregrin calls a “strong inferentialist”.
Eventually I hope that thinking clearly about semantics ought to (among other things) help bring calm to the current mass hysteria which is the Semantic Web and Linked Data, and help steer all of that expended energy toward more consequential ends.”
“In the last few years many real-world networks have been found to show a so-called community structure organization. Much effort has been devoted in the literature to developing methods and algorithms that can efficiently highlight this hidden structure of the network, traditionally by partitioning the graph. Since network representations can be very complex, and can differ in many ways from the traditional graph model, each algorithm in the literature focuses on some of these properties and establishes, explicitly or implicitly, its own definition of community. According to this definition, it then extracts communities that reflect only some of the features of real communities. The aim of this survey is to provide a manual for the community discovery problem. Given a meta definition of what a community in a social network is, our aim is to organize the main categories of community discovery methods based on their own definition of community. Given a desired definition of community and the features of a problem (size of network, direction of edges, multidimensionality, and so on), this review paper is designed to provide a set of approaches that researchers could focus on.”
“We develop an exact wavelet transform on the three-dimensional ball (i.e. on the solid sphere), which we name the flaglet transform. For this purpose we first construct an exact harmonic transform on the radial line using damped Laguerre polynomials and develop a corresponding quadrature rule. Combined with the spherical harmonic transform, this approach leads to a sampling theorem on the ball and a novel three-dimensional decomposition which we call the Fourier-Laguerre transform. We relate this new transform to the well-known Fourier-Bessel decomposition and show that band-limitedness in the Fourier-Laguerre basis is a sufficient condition to compute the Fourier-Bessel decomposition exactly. We then construct the flaglet transform on the ball through a harmonic tiling, which is exact thanks to the exactness of the Fourier-Laguerre transform (from which the name flaglets is coined). The corresponding wavelet kernels have compact localisation properties in real and harmonic space and their angular aperture is invariant under radial translation. We introduce a multiresolution algorithm to perform the flaglet transform rapidly, while capturing all information at each wavelet scale in the minimal number of samples on the ball. Our implementation of these new tools achieves floating point precision and is made publicly available. We perform numerical experiments demonstrating the speed and accuracy of these libraries and illustrate their capabilities on a simple denoising example.”
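For readers unfamiliar with the construction: the decomposition pairs spherical harmonics on the angular part with damped Laguerre polynomials on the radial part. Schematically (my notation and normalisation conventions, assumed rather than copied from the paper):

```latex
% Schematic Fourier-Laguerre expansion of a band-limited
% function f on the ball, with Y_{\ell m} the spherical
% harmonics and K_p the damped Laguerre radial basis:
f(r, \theta, \varphi)
  = \sum_{p=0}^{P-1} \sum_{\ell=0}^{L-1} \sum_{m=-\ell}^{\ell}
    f_{\ell m p}\, K_p(r)\, Y_{\ell m}(\theta, \varphi)
```

The flaglet wavelets are then built by tiling the harmonic indices (ℓ, p), which is why exactness of the Fourier-Laguerre transform carries over to the wavelet transform.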
“When agents with independent priors bid for a single item, Myerson’s optimal auction maximizes expected revenue, whereas Vickrey’s second-price auction optimizes social welfare. We address the natural question of trade-offs between the two criteria, that is, auctions that optimize, say, revenue under the constraint that the welfare is above a given level. If one allows for randomized mechanisms, it is easy to see that there are polynomial-time mechanisms that achieve any point in the trade-off (the Pareto curve) between revenue and welfare. We investigate whether one can achieve the same guarantees using deterministic mechanisms. We provide a negative answer to this question by showing that this is a (weakly) NP-hard problem. On the positive side, we provide polynomial-time deterministic mechanisms that approximate with arbitrary precision any point of the trade-off between these two fundamental objectives for the case of two bidders, even when the valuations are correlated arbitrarily. The major problem left open by our work is whether there is such an algorithm for three or more bidders with independent valuation distributions.”
“Many images nowadays are captured from behind glass and may show stains or other discrepancies because of it; such images must be processed to differentiate between the glass and the objects behind it. This research paper proposes an algorithm to remove the damaged or corrupted part of the image, make it consistent with the rest of the image, and segment the objects behind the glass. The damaged part is removed using a total variation inpainting method, and segmentation is done using k-means clustering, anisotropic diffusion and the watershed transformation. The final output is obtained by interpolation. This algorithm can be useful for applications in which some parts of an image are corrupted during data transmission, or in which objects must be segmented from an image for further processing.”
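As a rough illustration of this kind of inpaint-then-segment pipeline (my sketch, not the authors’ code), here is a version built from off-the-shelf pieces. Two substitutions are forced by what the libraries provide: biharmonic inpainting stands in for total variation inpainting, and total variation denoising stands in for anisotropic diffusion; the file names are hypothetical.

```python
import numpy as np
from skimage import io, color, filters, restoration, segmentation
from sklearn.cluster import KMeans

image = io.imread("glass_scene.png")        # hypothetical input image
damage = io.imread("damage_mask.png") > 0   # hypothetical mask, 1 = corrupted

# 1. Repair the corrupted region (biharmonic inpainting as a stand-in
#    for the paper's total variation inpainting).
repaired = restoration.inpaint_biharmonic(image, damage, channel_axis=-1)

# 2. Edge-preserving smoothing (TV denoising as a stand-in for
#    anisotropic diffusion).
gray = color.rgb2gray(repaired)
smooth = restoration.denoise_tv_chambolle(gray, weight=0.1)

# 3. Coarse labelling of pixels by intensity with k-means.
kmeans_labels = KMeans(n_clusters=3, n_init=10).fit_predict(
    smooth.reshape(-1, 1)).reshape(smooth.shape)

# 4. Watershed on the gradient magnitude, seeded from confident
#    dark/bright pixels, to refine object boundaries.
markers = np.zeros(smooth.shape, dtype=int)
markers[smooth < np.quantile(smooth, 0.1)] = 1
markers[smooth > np.quantile(smooth, 0.9)] = 2
segments = segmentation.watershed(filters.sobel(smooth), markers)
```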
“But it’ll be your decision, not inertia or fate. The ongoing cadence of asking these questions (and, maybe, the content of any answers you come up with) will convene an open space for you to live in. A world where whatever you do is right.”
“The Pirate University is an on-line bulletin board on which students post requests for academic publications. You can compare it to an academic wish list. Others, who know where to find these publications, reply and, if possible, provide links to the requested resources. The Pirate University is not providing, storing or sharing copyrighted material.
An important question is whether the uploading of articles and publications is legal. If you are the copyright holder of the article requested, there should be no problem. Likewise, in certain cases, if you or your institute have acquired the rights to the publication, or if it is free of rights, there shouldn’t be a problem. It is probably best to consult with your librarian to see which kinds of publication are okay to share on the Internet.”
“…We propose a novel cooperative iterative algorithm which copes with the communication constraints imposed by the network and shows remarkable performance. Our main result is a rigorous proof of the convergence of the algorithm and a characterization of the limit behavior. We also show that, in the limit when the number of sensors goes to infinity, the common unknown parameter is estimated with arbitrarily small error, while the classification error converges to that of the optimal centralized maximum likelihood estimator. We also show numerical results that validate the theoretical analysis and support its possible generalization. We compare our strategy with the Expectation-Maximization algorithm and we discuss trade-offs in terms of robustness, speed of convergence and implementation simplicity.”
“Function graphs are graphs representable by intersections of continuous real-valued functions on the interval [0,1] and are known to be exactly the complements of comparability graphs. As such they are recognizable in polynomial time. Function graphs generalize permutation graphs, which arise when all functions considered are linear.
We focus on the problem of extending partial representations, which generalizes the recognition problem. We observe that for permutation graphs an easy extension of Golumbic’s comparability graph recognition algorithm can be exploited. This approach fails for function graphs. Nevertheless, we present a polynomial-time algorithm for extending a partial representation of a graph by functions defined on the entire interval [0,1] provided for some of the vertices. On the other hand, we show that if a partial representation consists of functions defined on subintervals of [0,1], then the problem of extending this representation to functions on the entire interval [0,1] becomes NP-complete.”
“Exemplar-based clustering methods have been shown to produce state-of-the-art results on a number of synthetic and real-world clustering problems. They are appealing because they offer computational benefits over latent-mean models and can handle arbitrary pairwise similarity measures between data points. However, when trying to recover underlying structure in clustering problems, tailored similarity measures are often not enough; we also desire control over the distribution of cluster sizes. Priors such as Dirichlet process priors allow the number of clusters to be unspecified while expressing priors over data partitions. To our knowledge, they have not been applied to exemplar-based models. We show how to incorporate priors, including Dirichlet process priors, into the recently introduced affinity propagation algorithm. We develop an efficient max-product belief propagation algorithm for our new model and demonstrate experimentally how the expanded range of clustering priors allows us to better recover true clusterings in situations where we have some information about the generating process.”
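The baseline the paper extends is available off the shelf, which makes the contribution easy to situate: in scikit-learn’s affinity propagation, the scalar preference is the only crude lever over how many exemplars emerge, and that is the lever the paper’s partition priors generalize. A minimal run (my example, not the authors’ code):

```python
import numpy as np
from sklearn.cluster import AffinityPropagation

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 2)),    # two synthetic blobs
               rng.normal(5, 1, (50, 2))])

# Lower (more negative) preference values discourage points from
# becoming exemplars, yielding fewer clusters.
ap = AffinityPropagation(preference=-50, random_state=0).fit(X)
print("number of clusters:", len(ap.cluster_centers_indices_))
```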
“‘All of our work has given me a very strong view,’ Richard Boyatzis told me one afternoon. The consulting firm Boyatzis heads, McBer and Company, was founded by David McClelland in 1963. Its specialty has been analyzing what people actually do in business jobs—not what their job descriptions say, but how they spend their time and which skills seem most important to their success. ‘I’ve come to see that whenever a group institutes a credentialing process, whether by licensing or insisting on advanced degrees, the espoused rhetoric is to enforce the standards of professionalism. This is true whether it’s among accountants or plumbers or physicians. But the observed consequences always seem to be these two: the exclusion of certain groups, whether by intention or not, and the establishment of mediocre performance standards.’”
“In an attempt to find a polynomial-time algorithm for the edge-clique cover problem on cographs we tried to prove that the edge-clique graphs of cographs have bounded rankwidth. However, this is not the case. In this note we show that the edge-clique graphs of cocktail party graphs have unbounded rankwidth.”
“We present an algorithm that identifies the reasoning patterns of agents in a game, by iteratively examining the graph structure of its Multi-Agent Influence Diagram (MAID) representation. If the decision of an agent participates in no reasoning patterns, then we can effectively ignore that decision for the purpose of calculating a Nash equilibrium for the game. In some cases, this can lead to exponential time savings in the process of equilibrium calculation. Moreover, our algorithm can be used to enumerate the reasoning patterns in a game, which can be useful for constructing more effective computerized agents interacting with humans.”
“I could not tell you how many times I’ve encountered libertarian arguments about law that assume that individuals can and ought to use contracts to protect themselves against just this sort of contingency. Don’t worry about users clicking “I agree” to overreaching terms of service; if they truly cared about the terms, they’d negotiate for better ones. Don’t worry about people who refuse to buy health insurance; they’re making a rational choice for themselves. Don’t worry about minority shareholders, don’t worry about franchisees, don’t worry about all the other groups that find themselves on the wrong end of a bargain that always seems to tip against them in the long run—if they wanted better protections, they could and should have negotiated for them up front.
Except they don’t. They never do. And really: if the uber-libertarians of the Cato Institute can’t watch out for themselves, what hope is there for the rest of us?”
“A preliminary study has shown that this dynamic lighting is perceived as pleasant. A group of volunteers performed office duties for four days under light from a 30 by 60 cm ceiling display. On the first day the light was static, on the second it fluctuated gently, and on the third the changes in lighting conditions were more rapid. On the final day, the majority of volunteers (80 percent) said they wished to continue working with the rapidly fluctuating light.”
“Nonetheless, the success of JMLR does provide a clue that the cost of running a premier journal might be far less than publishers imply, if they were to rethink the process substantially — maybe not $10 per article, but surely far less than the $5,000 average revenue per article that scholarly publishers currently receive. This expectation is borne out by the several non-profit and commercial open-access journal publishers that are able to operate in the black with publication fees a fraction of that average.”
“Perhaps not quite as exciting as revivified dinosaurs, but still amazing: plants from the late Paleolithic era are claimed to have been regenerated from fossil material (Yashina et al. 2012. Regeneration of whole fertile plants from 30,000-y-old fruit tissue buried in Siberian permafrost. PNAS doi:10.1073/pnas.1118386109). This has very little to do with systems biology, but I was interested and thought you would be too. Perhaps I could trace some kind of connection (did you know that our Artist-no-longer-in-Residence, Brian Knep, shared two Academy Awards for his work on the movie Jurassic Park?) but it would be forced and hardly worth it. Better to admit to mild frivolity.”
“Two nebbish Representatives, one Republican and one Democrat, distinguished only by their lack of legislative or political importance, sponsored the bill on behalf of the big boys who fast-tracked it under the radar (they learned from the SOPA debacle). Forget ideology, or boasts about carrying a copy of the Constitution in a breast pocket: whether you are in an archconservative Congressional district or an ultraliberal one, almost every member of Congress voted “aye” to trash multiple amendments in the Bill of Rights.
Almost every one.
This is an accelerating trend of recent years and, in particular, a bipartisan theme of the 112th Congress, which views the Constitutional rights of nobodies as an anachronistic hindrance to the interests (or convenience) of its powerful and wealthy political supporters. Our elected officials and their backers increasingly share an oligarchic class interest that, in important matters, trumps the Kabuki partisanship of FOXnews and MSNBC and inculcates a technocratic admiration for the “efficiency” of select police states.”
“Lately I’ve been thinking a lot about the impact of the move to a thick client architecture for web applications, and I’m becoming more and more certain that this means that Rails-style MVC frameworks on the server-side are going to end up being phased out in favour of leaner and meaner frameworks that better address the new needs of thick-client architecture.”
“The industry is inhibited by several obstacles that executives themselves candidly acknowledge. One involves the difficulty of changing the behavior of people trained in the ways of a mature and monopolistic industry. Still another is the unavoidable fact that the part of the newspaper industry that is growing, digital, continues to provide only a small part of the revenue, while the part that is shrinking, print, provides most of the money, a paradox that is difficult to navigate and hard to resist. One pervasive feeling is that 15 years into the digital transition, executives still feel they are in the early stages of figuring out how to proceed.”
“My readers ask me that question more than just about any other. So here’s my question back: What is school for? (Click the link to get to the free download).
I’ve just published a 30,000 word manifesto, totally free to read, share, translate, print and, most of all, use to start an essential conversation. It took a lot to get it to you, and I’m encouraging you to take a few minutes to check it out. After you read it, perhaps you’ll write one of your own.”
“…In this paper, we introduce a replay methodology for contextual bandit algorithm evaluation. Different from simulator-based approaches, our method is completely data-driven and very easy to adapt to different applications. More importantly, our method can provide provably unbiased evaluations. Our empirical results on a large-scale news article recommendation dataset collected from Yahoo! Front Page conform well with our theoretical results. Furthermore, comparisons between our offline replay and online bucket evaluation of several contextual bandit algorithms show the accuracy and effectiveness of our offline evaluation method.”
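The replay idea itself fits in a few lines. Assuming logged triples of (context, displayed arm, reward) collected under a uniformly random logging policy, the regime in which the unbiasedness guarantee is proven, an evaluator can be sketched as follows (my sketch, not the authors’ code; `policy.choose` and `policy.update` are hypothetical method names):

```python
# Offline replay evaluation of a contextual bandit policy: events
# where the candidate policy disagrees with the logged arm are
# simply skipped; under uniform logging, the mean of the retained
# rewards is an unbiased estimate of the policy's per-trial reward.
def replay_evaluate(policy, logged_events):
    total_reward, matched = 0.0, 0
    for context, logged_arm, reward in logged_events:
        if policy.choose(context) == logged_arm:
            total_reward += reward
            matched += 1
            policy.update(context, logged_arm, reward)  # learn online
    return total_reward / max(matched, 1)
```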
“In terms of holdings, the fund has a heavy focus on health care (41.7%) and industrial revenue (28.3%) bonds which comprise the lion’s share of the assets. State exposure is also pretty spread out as California bonds comprise about 18.3% of the fund while New York bonds are another 11.1%. Beyond these two, the rest of the top five is rounded out by the territory of Puerto Rico (8.5%), and then the states of New Jersey (7.6%) and Ohio (6.9%). Maturity levels are tilted towards the longer end of the curve giving the fund a greater focus on yield but also on interest rate risk as well. Thanks to this, the fund pays out a 30 Day SEC Yield of 5.55%, a level that transfers over to 8.5% in tax equivalent terms for those in the top tax bracket.”
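The tax-equivalent arithmetic is the standard formula; the quoted 8.5% corresponds to the then-top federal bracket of 35%:

```latex
\text{tax-equivalent yield}
  = \frac{\text{tax-free yield}}{1 - t}
  = \frac{0.0555}{1 - 0.35} \approx 8.5\%
```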
“A good visualization conveys key information to those who may have trouble interpreting numbers and/or statistics, which can make your findings accessible to a wider audience (more on this below). Visualizations also give your audience a break from lexical processing, which is especially useful when you are presenting your findings–people can listen to you and process the findings from a well-designed visual at the same time, but most people have trouble listening while reading your PowerPoint bullet points. Visualizations also convey key information embedded in massive amounts of data, which can aid your own exploratory analysis of data, no matter how massive.”
“These enormous, intricate designs are the creation of one man and his snowshoes. Simon Beck, whom you can see in the last image below, conceives and executes these patterns, turning fresh snow into alien messages.”
“The debate is an old one. New however is the ease – though, I can assure you, editing away objects in Photoshop in a clean way is far from easy – and the extent to which manipulation can be done today. Magic Wand-ing, cloning and gaussian blur are now part even of the vocabularies of a growing number of retirees with too much spare time and an interest in photography. The expectation that a beautiful image ‘has to be manipulated’ is so ingrained that we don’t even pause to question our own paranoia.
But, rather than bothering ourselves with the question of whether an image is 100% ‘true’ – something that, in my own opinion, it will never be – we should ask ourselves whether adaptations (not ‘manipulations’) are reasonable; whether they add or remove something essential to the image. Erasing some zits from a model’s face is perfectly reasonable. Making eyes a little brighter can be legitimate. Blowing up boobs, lengthening legs and shrinking waists is not.
The ethics surrounding photo-manipulation are never as simple as a yes-or-no question, and are not even a ‘thin line’; they are a minefield in a no man’s land. That careers can be scuttled by being ‘caught’ is sad, in particular because in the trench war between ‘digital compositors’ and photo-purists there appears to be little willingness to come to a middle ground.”
“This deck of 91 full-colour cards names what skilled facilitators and other participants do to make things work. The content is more specific than values and less specific than tips and techniques, cutting across existing methodologies with a designer’s eye to capture the patterns that repeat. The deck can be used to plan sessions, reflect on and debrief them, provide guidance, and share responsibility for making the process go well. It has the potential to provide a common reference point for practitioners, and to serve as a framework and learning tool for those studying the field.”
“A new model for computer simulation of solids, composed of bonded particles, is proposed. Vectors rigidly connected to the particles are used to describe the deformation of a single bond. An expression for the potential energy of the bond and corresponding expressions for forces and moments are proposed. Formulas connecting the parameters of the model with the longitudinal, shear, bending and torsional stiffnesses of the bond are derived. It is shown that the model can describe any values of the bond stiffnesses exactly. Two different calibration procedures, depending on the bond length/thickness ratio, are proposed. It is shown that the parameters of the model can be chosen so that, under small deformations, the bond is equivalent to a Bernoulli-Euler rod, a Timoshenko rod, or a short cylinder connecting the particles. Simple expressions connecting the parameters of the V-model with the geometrical and mechanical characteristics of the bond are derived. Computer simulation of the dynamical buckling of a straight discrete rod and a discrete half-spherical shell is carried out.”
“Inspired by birds flying through cluttered environments such as dense forests, this paper studies the theoretical foundations of a novel motion planning problem: high-speed navigation through a randomly-generated obstacle field when only the statistics of the obstacle generating process are known a priori. Resembling a planar forest environment, the obstacle generating process is assumed to determine the locations and sizes of disk-shaped obstacles. When this process is ergodic, and under mild technical conditions on the dynamics of the bird, it is shown that the existence of an infinite collision-free trajectory through the forest exhibits a phase transition. On one hand, if the bird flies faster than a certain critical speed, then, with probability one, there is no infinite collision-free trajectory, i.e., the bird will eventually collide with some tree, almost surely, regardless of the planning algorithm governing the bird’s motion. On the other hand, if the bird flies slower than this critical speed, then there exists at least one infinite collision-free trajectory, almost surely. Lower and upper bounds on the critical speed are derived for the special case of a homogeneous Poisson forest considering a simple model for the bird’s dynamics. For the same case, an equivalent percolation model is provided. Using this model, the phase diagram is approximated in Monte-Carlo simulations. This paper also establishes novel connections between robot motion planning and statistical physics through ergodic theory and percolation theory, which may be of independent interest.”
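A toy experiment conveys the flavor of the result, though it is emphatically not the paper’s model: treat higher speed as inflating each tree’s effective radius (the faster you fly, the more clearance you need to maneuver), sample a Poisson forest, and ask whether any collision-free crossing of a square region survives, via breadth-first search on a fine grid. Sweeping the effective radius exposes a sharp transition in crossing probability.

```python
# Toy percolation-style probe of a Poisson forest (a deliberate
# simplification of the paper's setting: speed enters only through
# an inflated effective obstacle radius, with no dynamics).
import numpy as np
from collections import deque

def crossing_exists(density, eff_radius, size=20.0, res=0.1, seed=None):
    rng = np.random.default_rng(seed)
    n = int(size / res)
    trees = rng.uniform(0.0, size, (rng.poisson(density * size * size), 2))
    xs = (np.arange(n) + 0.5) * res
    gx, gy = np.meshgrid(xs, xs, indexing="ij")
    free = np.ones((n, n), dtype=bool)
    for tx, ty in trees:                      # block cells inside any disk
        free &= (gx - tx) ** 2 + (gy - ty) ** 2 > eff_radius ** 2
    # Breadth-first search from the left edge to the right edge.
    frontier = deque((0, j) for j in range(n) if free[0, j])
    seen = set(frontier)
    while frontier:
        i, j = frontier.popleft()
        if i == n - 1:
            return True
        for ni, nj in ((i + 1, j), (i - 1, j), (i, j + 1), (i, j - 1)):
            if 0 <= ni < n and 0 <= nj < n and free[ni, nj] \
                    and (ni, nj) not in seen:
                seen.add((ni, nj))
                frontier.append((ni, nj))
    return False
```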
“In this paper we propose a method based on interacting particle physics, devised for clustering Euclidean datasets without initial constraints or conditions. We model any dataset as an interacting particle system, whose elements correspond to particles that interact through a simplified version of the Lennard-Jones potential. In this way, mutual attractive interactions make it possible to identify groups of proximal particles. The main outcome of this modeling task is an adjacency matrix, taken as input by a community detection algorithm aimed at identifying different partitions. The underlying conjecture is that, using a multiresolution analysis, the adopted model makes it possible to find the right number of clusters for any given dataset. Experiments, performed in comparison with a classical clustering algorithm, confirm this assumption.”
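In its standard form the Lennard-Jones potential is V(r) = 4ε[(σ/r)^12 − (σ/r)^6]; the paper uses its own simplified variant, but the step from pairwise potentials to an adjacency matrix can be sketched generically (my sketch, with a hypothetical attraction threshold):

```python
# Build a binary adjacency matrix by thresholding pairwise
# Lennard-Jones energies: an edge appears wherever the attractive
# well between two points is deep enough.
import numpy as np
from scipy.spatial.distance import pdist, squareform

def lj_adjacency(X, sigma=1.0, eps=1.0, threshold=-0.01):
    r = squareform(pdist(X))
    np.fill_diagonal(r, np.inf)               # no self-interaction
    v = 4 * eps * ((sigma / r) ** 12 - (sigma / r) ** 6)
    return (v < threshold).astype(int)
```

The resulting matrix can then be handed to any community detection routine, which is where the multiresolution analysis described above enters.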
“This paper develops generalizations of empowerment to continuous states. Empowerment is a recently introduced information-theoretic quantity motivated by hypotheses about the efficiency of the sensorimotor loop in biological organisms, but also from considerations stemming from curiosity-driven learning. Empowerment measures, for agent-environment systems with stochastic transitions, how much influence an agent has on its environment, but only that influence that can be sensed by the agent’s sensors. It is an information-theoretic generalization of joint controllability (influence on environment) and observability (measurement by sensors) of the environment by the agent, both controllability and observability being usually defined in control theory as the dimensionality of the control/observation spaces.…”
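In the discrete setting that the paper generalizes, empowerment is usually written as a channel capacity from an action sequence to the resulting sensor state; in the standard notation of that literature (not copied from this paper):

```latex
% n-step empowerment: the capacity of the channel from the
% agent's next n actions A^n_t to its sensor state S_{t+n}:
\mathfrak{E}(s_t) \;=\; \max_{p(a^n_t)} I\!\left(A^n_t;\, S_{t+n} \mid s_t\right)
```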
“In evaluating prediction markets (and other crowd-prediction mechanisms), investigators have repeatedly observed a so-called “wisdom of crowds” effect, which roughly says that the average of participants performs much better than the average participant. The market price—an average or at least aggregate of traders’ beliefs—offers a better estimate than most any individual trader’s opinion. In this paper, we ask a stronger question: how does the market price compare to the best trader’s belief, not just the average trader’s? We measure the market’s worst-case log regret, a notion common in machine learning theory. To arrive at a meaningful answer, we need to assume something about how traders behave. We suppose that every trader optimizes according to the Kelly criterion, a strategy that provably maximizes the compound growth of wealth over an (infinite) sequence of market interactions. We show several consequences.…”
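For reference, a Kelly bettor maximizes expected log-wealth rather than expected wealth; for a binary bet at net odds b with subjective win probability p, the optimal fraction of wealth staked is the standard result (not specific to this paper):

```latex
f^{*} \;=\; \arg\max_{f \in [0,1)}
  \Big[\, p \ln(1 + f b) + (1 - p) \ln(1 - f) \,\Big]
  \;=\; p - \frac{1 - p}{b}
```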
“We investigate the ray-length distributions for two different rectangular versions of Gilbert’s tessellation. In the full rectangular version, lines extend either horizontally (with east- and west-growing rays) or vertically (north- and south-growing rays) from seed points which form a Poisson point process, each ray stopping when another ray is met. In the half rectangular version, east- and south-growing rays do not interact with west and north rays. For the half rectangular tessellation we compute analytically, via recursion, a series expansion for the ray-length distribution, whilst for the full rectangular version we develop an accurate simulation technique, based in part on the stopping-set theory of Zuyev, to accomplish the same. We demonstrate the remarkable fact that plots of the two distributions appear to be identical when the intensity of seeds in the half model is twice that in the full model. Our paper explores this coincidence mindful of the fact that, for one model, our results are from a simulation (with inherent sampling error).…”
“A number of representation schemes have been presented for use within Learning Classifier Systems, ranging from binary encodings to neural networks. This paper presents results from an investigation into using discrete and fuzzy dynamical system representations within the XCSF Learning Classifier System. In particular, asynchronous Random Boolean Networks are used to represent the traditional condition-action production system rules in the discrete case and asynchronous Fuzzy Logic Networks in the continuous-valued case. It is shown possible to use self-adaptive, open-ended evolution to design an ensemble of such dynamical systems within XCSF to solve a number of well-known test problems.”
“In this paper we define what we call an affinity system, which is a set of individuals, each with a vector characterizing its preference for all other individuals in the set. The preference of a member can be given either by a ranking of all members or by a weighted vector that defines the degrees of its affinity to others. Affinity systems are useful for modeling social systems as well as general data sets, as social interactions are often determined by affinities among the members. We also define a natural notion of (potentially overlapping) communities in an affinity system, in which the members of a given community collectively prefer each other to anyone else outside the community. Thus these communities are “self-determined” or “self-certified” by the affinity system. We provide a tight polynomial bound on the number of self-determined communities as a function of the robustness of the community. Moreover, we present a polynomial-time algorithm for enumerating these communities, as well as a local algorithm with a strong stochastic performance guarantee that can find a community in time nearly linear in the size of the community.…”
“We introduce the optimal obstacle placement with disambiguations problem wherein the goal is to place true obstacles in an environment cluttered with false obstacles so as to maximize the total traversal length of a navigating agent (NAVA). Prior to the traversal, NAVA is given location information and probabilistic estimates of each disk-shaped hindrance (hereinafter referred to as disk) being a true obstacle. The NAVA can disambiguate a disk’s status only when situated on its boundary. There exists an obstacle placing agent (OPA) that locates obstacles prior to NAVA’s traversal. The goal of OPA is to place true obstacles in between the clutter in such a way that NAVA’s traversal length is maximized in a game-theoretic sense.…”
“We review the observations and the basic laws describing the essential aspects of collective motion — being one of the most common and spectacular manifestations of coordinated behavior. Our aim is to provide a balanced discussion of the various facets of this highly multidisciplinary field, including experiments, mathematical methods and models for simulations, so that readers with a variety of backgrounds can get both the basics and a broader, more detailed picture of the field. The observations we report on include systems consisting of units ranging from macromolecules through metallic rods and robots to groups of animals and people. Some emphasis is put on models that are simple and realistic enough to reproduce the numerous related observations and are useful for developing concepts for a better understanding of the complexity of systems consisting of many simultaneously moving entities. As such, these models allow a few fundamental principles of flocking to be established. In particular, it is demonstrated that, in spite of considerable differences, a number of deep analogies exist between equilibrium statistical physics systems and those made of self-propelled (in most cases living) units. In both cases only a few well-defined macroscopic/collective states occur and the transitions between these states follow a similar scenario, involving discontinuity and algebraic divergences.”
“Data collection at a massive scale is becoming ubiquitous in a wide variety of settings, from vast offline databases to streaming real-time information. Learning algorithms deployed in such contexts must rely on single-pass inference, where the data history is never revisited. In streaming contexts, learning must also be temporally adaptive to remain up-to-date against unforeseen changes in the data generating mechanism. Although rapidly growing, the online Bayesian inference literature remains challenged by massive data and transient, evolving data streams. Non-parametric modelling techniques can prove particularly ill-suited, as the complexity of the model is allowed to increase with the sample size. In this work, we take steps to overcome these challenges by porting standard streaming techniques, like data discarding and downweighting, into a fully Bayesian framework via the use of informative priors and active learning heuristics. We showcase our methods by augmenting a modern non-parametric modelling framework, dynamic trees, and illustrate its performance on a number of practical examples. The end product is a powerful streaming regression and classification tool, whose performance compares favourably to the state-of-the-art.”
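Of the two streaming devices named here, downweighting is the simpler to make concrete: discount the accumulated sufficient statistics before absorbing each new observation, so old data fade geometrically. A generic sketch (mine; the paper embeds the idea in dynamic trees, not in anything this simple):

```python
# Exponentially downweighted streaming mean: each update discounts
# the past by lambda_ < 1, so the effective sample size plateaus
# near 1 / (1 - lambda_) and the estimate can track drift in the
# data-generating process.
class ForgetfulMean:
    def __init__(self, lambda_=0.99):
        self.lambda_ = lambda_
        self.s = 0.0   # discounted sum of observations
        self.w = 0.0   # discounted observation count

    def update(self, x):
        self.s = self.lambda_ * self.s + x
        self.w = self.lambda_ * self.w + 1.0
        return self.s / self.w
```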
The fallacy in this reasoning is glaring. The candidate supported by progressives — President Obama — himself holds heinous views on a slew of critical issues and himself has done heinous things with the power with which he has been vested. He has slaughtered civilians — Muslim children by the dozens — not once or twice, but continuously in numerous nations with drones, cluster bombs and other forms of attack. He has sought to overturn a global ban on cluster bombs. He has institutionalized the power of Presidents — in secret and with no checks — to target American citizens for assassination-by-CIA, far from any battlefield. He has waged an unprecedented war against whistleblowers, the protection of which was once a liberal shibboleth. He rendered permanently irrelevant the War Powers Resolution, a crown jewel in the list of post-Vietnam liberal accomplishments, and thus enshrined the power of Presidents to wage war even in the face of a Congressional vote against it. His obsession with secrecy is so extreme that it has become darkly laughable in its manifestations, and he even worked to amend the Freedom of Information Act (another crown jewel of liberal legislative successes) when compliance became inconvenient.
Laws are more often than not an annoyance, despite their aim of improving the legal framework in any given field. Free Software (AKA “Open Source”) has thrived despite the absence of any explicit recognition in the law, if not indeed in spite of rules that are clearly shaped around proprietary software. In many jurisdictions it has passed the enforceability test. So, no laws seem necessary to make it work. Yet, can some legal principle be put forward, and included in some laws, to help?
We introduce the problem of reconstructing a sequence of multidimensional real vectors where some of the data are missing. This problem contains regression and mapping inversion as particular cases where the pattern of missing data is independent of the sequence index. The problem is hard because it involves possibly multivalued mappings at each vector in the sequence, where the missing variables can take more than one value given the present variables; and the set of missing variables can vary from one vector to the next. To solve this problem, we propose an algorithm based on two redundancy assumptions: vector redundancy (the data live in a low-dimensional manifold), so that the present variables constrain the missing ones; and sequence redundancy (e.g. continuity), so that consecutive vectors constrain each other. We capture the low-dimensional nature of the data in a probabilistic way with a joint density model, here the generative topographic mapping, which results in a Gaussian mixture. Candidate reconstructions at each vector are obtained as all the modes of the conditional distribution of missing variables given present variables. The reconstructed sequence is obtained by minimising a global constraint, here the sequence length, by dynamic programming. We present experimental results for a toy problem and for inverse kinematics of a robot arm.
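The dynamic-programming step is the easiest part to make concrete. Given the per-step candidate reconstructions (the modes of each conditional distribution), choosing one candidate per step so as to minimize the total length of the sequence is a shortest path through a trellis, in the style of the Viterbi algorithm (my generic sketch, not the authors’ implementation):

```python
# Viterbi-style selection: candidates[t] is a list of candidate
# vectors for step t; pick one per step minimizing the summed
# Euclidean length of the reconstructed sequence.
import numpy as np

def shortest_sequence(candidates):
    cost = [0.0] * len(candidates[0])
    back = []
    for t in range(1, len(candidates)):
        new_cost, pointers = [], []
        for c in candidates[t]:
            d = [cost[i] + np.linalg.norm(np.asarray(c) - np.asarray(p))
                 for i, p in enumerate(candidates[t - 1])]
            best = int(np.argmin(d))
            new_cost.append(d[best])
            pointers.append(best)
        cost = new_cost
        back.append(pointers)
    # Trace back the optimal choice at each step.
    idx = int(np.argmin(cost))
    path = [idx]
    for pointers in reversed(back):
        idx = pointers[idx]
        path.append(idx)
    path.reverse()
    return [candidates[t][i] for t, i in enumerate(path)]
```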
In many data acquisition systems it is common to observe signals whose amplitudes have been clipped. We present two new algorithms for recovering a clipped signal by leveraging the model assumption that the underlying signal is sparse in the frequency domain. Both algorithms employ ideas commonly used in the field of Compressive Sensing; the first is a modified version of Reweighted $\ell_1$ minimization, and the second is a modification of a simple greedy algorithm known as Trivial Pursuit. An empirical investigation shows that both approaches can recover signals with significant levels of clipping.
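The first approach can be approximated in a few lines with an off-the-shelf convex solver. The sketch below is my reconstruction of the generic reweighted ℓ1 recipe, not the authors’ algorithm: represent the signal in a DCT dictionary, require agreement with the unclipped samples, force clipped samples to stay beyond the clipping level, and iterate the usual reweighting.

```python
# Reweighted l1 declipping sketch: seek sparse DCT coefficients c
# such that the synthesized signal B @ c is consistent with both
# the observed samples and the clipping constraints.
import numpy as np
import cvxpy as cp
from scipy.fft import idct

def declip(y, clip_level, iters=3, eps=1e-3):
    n = len(y)
    B = idct(np.eye(n), axis=0, norm="ortho")   # inverse-DCT dictionary
    ok = np.where(np.abs(y) < clip_level)[0]    # trusted samples
    hi = np.where(y >= clip_level)[0]           # clipped from above
    lo = np.where(y <= -clip_level)[0]          # clipped from below
    w = np.ones(n)
    for _ in range(iters):
        c = cp.Variable(n)
        x = B @ c
        cp.Problem(cp.Minimize(cp.norm1(cp.multiply(w, c))),
                   [x[ok] == y[ok],
                    x[hi] >= clip_level,
                    x[lo] <= -clip_level]).solve()
        w = 1.0 / (np.abs(c.value) + eps)       # sharpen sparsity
    return B @ c.value
```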
Nowadays we are often faced with huge databases resulting from the rapid growth of data storage technologies. This is particularly true when dealing with music databases. In this context, it is essential to have techniques and tools able to discriminate properties within these massive sets. In this work, we report on a statistical analysis of more than ten thousand songs aiming to obtain a complexity hierarchy. Our approach is based on the estimation of the permutation entropy combined with an intensive complexity measure, building up the complexity-entropy causality plane. The results obtained indicate that this representation space is very promising for discriminating songs as well as for allowing a relative quantitative comparison among songs. Additionally, we believe that the method reported here may be applied in practical situations, since it is simple, robust and has a fast numerical implementation.
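The first coordinate of that plane is easy to compute. The standard Bandt-Pompe recipe (my sketch; the embedding dimension d and delay tau are the usual free parameters) counts ordinal patterns and takes the normalized Shannon entropy of their distribution; the complexity coordinate additionally involves a divergence from the uniform pattern distribution, omitted here.

```python
# Normalized permutation entropy of a 1-D series: slide a window
# of length d (with delay tau), record the ordinal pattern of each
# window, and compute the Shannon entropy of the pattern frequencies
# relative to its maximum value, log(d!).
import math
from collections import Counter

def permutation_entropy(series, d=4, tau=1):
    patterns = Counter(
        tuple(sorted(range(d), key=lambda k: series[i + k * tau]))
        for i in range(len(series) - (d - 1) * tau)
    )
    total = sum(patterns.values())
    h = -sum((n / total) * math.log(n / total) for n in patterns.values())
    return h / math.log(math.factorial(d))
```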
We consider the problem of online audio source separation. Existing algorithms adopt either a sliding block approach or a stochastic gradient approach, which is faster but less accurate. Also, they rely either on spatial cues or on spectral cues and cannot separate certain mixtures. In this paper, we design a general online audio source separation framework that combines both approaches and both types of cues. The model parameters are estimated in the Maximum Likelihood (ML) sense using a Generalised Expectation Maximisation (GEM) algorithm with multiplicative updates. The separation performance is evaluated as a function of the block size and the step size and compared to that of an offline algorithm.
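The multiplicative updates mentioned at the end are the standard device from the NMF literature (generic form, not this paper’s specific update rules): split the gradient of the criterion J into its positive and negative parts and scale each nonnegative parameter by their ratio, which keeps parameters nonnegative and leaves stationary points fixed:

```latex
\theta \;\leftarrow\; \theta \cdot
  \frac{[\nabla_{\theta} J(\theta)]^{-}}{[\nabla_{\theta} J(\theta)]^{+}},
\qquad
\nabla_{\theta} J = [\nabla_{\theta} J]^{+} - [\nabla_{\theta} J]^{-}
```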