These are my recent Pinboard.in links:
[1006.5366] “Not only defended but also applied”: The perceived absurdity of Bayesian inference
“The missionary zeal of many Bayesians of old has been matched, in the other direction, by a view among some theoreticians that Bayesian methods are absurd-not merely misguided but obviously wrong in principle. We consider several examples, beginning with Feller’s classic text on probability theory and continuing with more recent cases such as the perceived Bayesian nature of the so-called doomsday argument. We analyze in this note the intellectual background behind various misconceptions about Bayesian statistics, without aiming at a complete historical coverage of the reasons for this dismissal.”
social-dynamics statistics martial-arts-schools
[1206.3268] Feature Selection via Block-Regularized Regression
“In this paper, we considered the problem of finding a subset of covariates in a high-dimensional space that affect the output variable when there is a block structure in the covariates. In the context of association mapping, we proposed a regression-based model with a Markov chain prior that encodes the information in the correlation structure such as distance and recombination rate between adjacent SNP markers. We demonstrated on the simulated and mouse data that our proposed algorithm can be used to identify groups of SNP markers as a relevant block of causal SNPs. The idea of representing the correlation structure as a Markov chain in a variable selection method to learn grouped relevant variables can be generalized to use a graphical model as a prior in a variable selection problem to represent an arbitrary correlation structure in variables in a high-dimensional space. Another interesting extension of the model is to model a structure in output variables as well when measurements of multiple output variables are available.”
statistics bioinformatics algorithms data-mining feature-extraction
“So, are you tired of this old and bored git log screen?”
yes software-development git tricks-n-tips bash
Neuroskeptic: Brains are Different on Macs
“The paper goes into lots more detail, but the lesson for researchers is extremely simple: don’t cross the streams of data-analysis. Set up your analysis stream and then use it on all of your data. Same hardware, same software, same settings. Imagine you’re doing a study comparing brain structure in two groups. Halfway through analyzing your data, you upgrade your MacOS. All of the brains you analyze after that will be, say, 5% “bigger”. That’ll certainly make your data much noisier, and if you happen to analyze most of Group A before Group B, it’ll give you a false positive finding. Sometimes you just can’t avoid changes in hardware or software — IT techs have a habit of upgrading things without asking — but in these cases, you should run the same data under the old and the new regime to see if it’s making a difference. Finally, it would be wrong to blame FreeSurfer for this. I’d be surprised if they were any worse than the other software packages. Mixing and matching versions is something that the FreeSurfer developers specifically warn against. This paper shows why.”
data-analysis reproducibility technical-assumptions anomalies-are-where-you-find-them
“I’ve been critical of objects and the idea of reference for a while now. To me sentences and propositions, by virtue of their role as “moves” in social interactions, are likely to have priority in a properly objective account of meaning. Many putative objects (e.g. corporations or mutable digital documents) border on being fictional, gaining their objecthood only through what we say about them; and many referring phrases seem to refer to different things, depending on what is being predicated. I think this opinion would make me what Peregrin calls a “strong inferentialist”. Eventually I hope that thinking clearly about semantics ought to (among other things) help bring calm to the current mass hysteria which is the Semantic Web and Linked Data, and help steer all of that energy expenditure to improve its consequence.”
pragmatism indirect-links philosophy talking-about-thinking-and-the-reverse
[1206.3552] A Classification for Community Discovery Methods in Complex Networks
“In the last few years many real-world networks have been found to show a so-called community structure organization. Much effort has been devoted in the literature to develop methods and algorithms that can efficiently highlight this hidden structure of the network, traditionally by partitioning the graph. Since network representation can be very complex and can contain different variants in the traditional graph model, each algorithm in the literature focuses on some of these properties and establishes, explicitly or implicitly, its own definition of community. According to this definition it then extracts the communities that are able to reflect only some of the features of real communities. The aim of this survey is to provide a manual for the community discovery problem. Given a meta definition of what a community in a social network is, our aim is to organize the main categories of community discovery based on their own definition of community. Given a desired definition of community and the features of a problem (size of network, direction of edges, multidimensionality, and so on) this review paper is designed to provide a set of approaches that researchers could focus on.”
via:cshalizi graph-theory community classification algorithms nudge
“We develop an exact wavelet transform on the three-dimensional ball (i.e. on the solid sphere), which we name the flaglet transform. For this purpose we first construct an exact harmonic transform on the radial line using damped Laguerre polynomials and develop a corresponding quadrature rule. Combined with the spherical harmonic transform, this approach leads to a sampling theorem on the ball and a novel three-dimensional decomposition which we call the Fourier-Laguerre transform. We relate this new transform to the well-known Fourier-Bessel decomposition and show that band-limitness in the Fourier-Laguerre basis is a sufficient condition to compute the Fourier-Bessel decomposition exactly. We then construct the flaglet transform on the ball through a harmonic tiling, which is exact thanks to the exactness of the Fourier-Laguerre transform (from which the name flaglets is coined). The corresponding wavelet kernels have compact localisation properties in real and harmonic space and their angular aperture is invariant under radial translation. We introduce a multiresolution algorithm to perform the flaglet transform rapidly, while capturing all information at each wavelet scale in the minimal number of samples on the ball. Our implementation of these new tools achieves floating point precision and is made publicly available. We perform numerical experiments demonstrating the speed and accuracy of these libraries and illustrate their capabilities on a simple denoising example.”
wavelets geometry representation-theory signal-processing answer-languages
“When agents with independent priors bid for a single item, Myerson’s optimal auction maximizes expected revenue, whereas Vickrey’s second-price auction optimizes social welfare. We address the natural question of trade-offs between the two criteria, that is, auctions that optimize, say, revenue under the constraint that the welfare is above a given level. If one allows for randomized mechanisms, it is easy to see that there are polynomial-time mechanisms that achieve any point in the trade-off (the Pareto curve) between revenue and welfare. We investigate whether one can achieve the same guarantees using deterministic mechanisms. We provide a negative answer to this question by showing that this is a (weakly) NP-hard problem. On the positive side, we provide polynomial-time deterministic mechanisms that approximate with arbitrary precision any point of the trade-off between these two fundamental objectives for the case of two bidders, even when the valuations are correlated arbitrarily. The major problem left open by our work is whether there is such an algorithm for three or more bidders with independent valuation distributions.”
algorithms Pareto-front performance-measure multiobjective-optimization
“Symbolsets are semantic symbol fonts. They work in modern browsers and anywhere OpenType features are supported.”
typography unicode
[1204.6653] Elimination of Glass Artifacts and Object Segmentation
“Many images nowadays are captured from behind the glasses and may have certain stains discrepancy because of glass and must be processed to make differentiation between the glass and objects behind it. This research paper proposes an algorithm to remove the damaged or corrupted part of the image and make it consistent with other part of the image and to segment objects behind the glass. The damaged part is removed using total variation inpainting method and segmentation is done using kmeans clustering, anisotropic diffusion and watershed transformation. The final output is obtained by interpolation. This algorithm can be useful to applications in which some part of the images are corrupted due to data transmission or needs to segment objects from an image for further processing.”
image-segmentation image-processing nudge-targets algorithms
“But it’ll be your decision, not inertia or fate. The ongoing cadence of asking these questions (and, maybe, the content of any answers you come up with) will convene an open space for you to live in. A world where whatever you do is right.”
this
“The Pirate University is an on-line bulletin board on which students post requests for academic publications. You can compare it to an academic wish list. Others, who know where to find these publications, reply and if possible, provide links to the resources searched. The Pirate University is not providing, storing or sharing copyrighted material. An important question is if the uploading of articles, publications is legal. If you are the copyright holder of the article requested, there should be no problem. Also in certain cases, if you or your institute have acquired the rights of the publication, or if it is free of rights, there shouldn’t be a problem. It is probably best to consult with your librarian to see which kind of publication is okay to share on the Internet.”
academic-culture publishing collaboration crowdsourcing librarians open-access scholarship
[1206.3793] A distributed classification/estimation algorithm for sensor networks
“…We propose a novel cooperative iterative algorithm which copes with the communication constraints imposed by the network and shows remarkable performance. Our main result is a rigorous proof of the convergence of the algorithm and a characterization of the limit behavior. We also show that, in the limit when the number of sensors goes to infinity, the common unknown parameter is estimated with arbitrary small error, while the classification error converges to that of the optimal centralized maximum likelihood estimator. We also show numerical results that validate the theoretical analysis and support their possible generalization. We compare our strategy with the Expectation-Maximization algorithm and we discuss trade-offs in terms of robustness, speed of convergence and implementation simplicity.”
distributed-processing collective-behavior sensor-networks algorithms nudge-targets
[1204.6391] Extending partial representations of function graphs and permutation graphs
“Function graphs are graphs representable by intersections of continuous real-valued functions on the interval [0,1] and are known to be exactly the complements of comparability graphs. As such they are recognizable in polynomial time. Function graphs generalize permutation graphs, which arise when all functions considered are linear. We focus on the problem of extending partial representations, which generalizes the recognition problem. We observe that for permutation graphs an easy extension of Golumbic’s comparability graph recognition algorithm can be exploited. This approach fails for function graphs. Nevertheless, we present a polynomial-time algorithm for extending a partial representation of a graph by functions defined on the entire interval [0,1] provided for some of the vertices. On the other hand, we show that if a partial representation consists of functions defined on subintervals of [0,1], then the problem of extending this representation to functions on the entire interval [0,1] becomes NP-complete.”
graph-theory math-i-didn’t-know representation-theory ontology interesting
[1206.3294] Flexible Priors for Exemplar-based Clustering
“Exemplar-based clustering methods have been shown to produce state-of-the-art results on a number of synthetic and real-world clustering problems. They are appealing because they offer computational benefits over latent-mean models and can handle arbitrary pairwise similarity measures between data points. However, when trying to recover underlying structure in clustering problems, tailored similarity measures are often not enough; we also desire control over the distribution of cluster sizes. Priors such as Dirichlet process priors allow the number of clusters to be unspecified while expressing priors over data partitions. To our knowledge, they have not been applied to exemplar-based models. We show how to incorporate priors, including Dirichlet process priors, into the recently introduced affinity propagation algorithm. We develop an efficient maxproduct belief propagation algorithm for our new model and demonstrate experimentally how the expanded range of clustering priors allows us to better recover true clusterings in situations where we have some information about the generating process.”
clustering algorithms
Magazine — The Case Against Credentialism — The Atlantic
‘“ALL OF OUR WORK HAS GIVEN ME A VERY STRONG view,” Richard Boyatzis told me one afternoon. The consulting firm Boyatzis heads, McBer and Company, was founded by David McClelland in 1963. Its specialty has been analyzing what people actually do in business jobs—not what their job descriptions say, but how they spend their time and which skills seem most important to their success. “I’ve come to see that whenever a group institutes a credentialing process, whether by licensing or insisting on advanced degrees, the espoused rhetoric is to enforce the standards of professionalism. This is true whether it’s among accountants or plumbers or physicians. But the observed consequences always seem to be these two: the exclusion of certain groups, whether by intention or not, and the establishment of mediocre performance standards.”’
professionalization credentialing Andrew-Abbott-smiles-in-Chicago authority expertise cultural-assumptions disintermediation-targets
[1205.2483] Edge-clique graphs of cocktail parties have unbounded rankwidth
“In an attempt to find a polynomial-time algorithm for the edge-clique cover problem on cographs we tried to prove that the edge-clique graphs of cographs have bounded rankwidth. However, this is not the case. In this note we show that the edge-clique graphs of cocktail party graphs have unbounded rank width.”
open-questions nudge-targets graph-theory algorithms
[1206.3235] Identifying reasoning patterns in games
“We present an algorithm that identifies the reasoning patterns of agents in a game, by iteratively examining the graph structure of its Multi-Agent Influence Diagram (MAID) representation. If the decision of an agent participates in no reasoning patterns, then we can effectively ignore that decision for the purpose of calculating a Nash equilibrium for the game. In some cases, this can lead to exponential time savings in the process of equilibrium calculation. Moreover, our algorithm can be used to enumerate the reasoning patterns in a game, which can be useful for constructing more effective computerized agents interacting with humans.”
game-theory inference strategy nudge-targets learning-by-watching