Posted Tue May 16 23:59:39 2017

1. John Steele Gordon, "A Short History of the National Debt," The Wall Street Journal, http://online.wsj.com/article/SB123491373049303821.html.

2. 向松祚, "大历史视角的美国国债和美元霸权"， http://www.caogen.com/blog/Infor_detail.aspx?ID=117&articleId=28690

3. Wikipedia contributors, "United States public debt," Wikipedia, The Free Encyclopedia, http://en.wikipedia.org/w/index.php?title=Plagiarism&oldid=5139350. And the series of articles on United States budget and debt topics.



More concise result with only image file paths:

$plutil -p ~/Library/Preferences/com.apple.desktop.plist | grep "\"ImageFilePath\"" | sed 's/^ *//' | sort "ImageFilePath" => "/Users/lumeng/Library/Caches/com.kuvva.Kuvva-Wallpapers/Wallpapers/2728c21dc3d845f1b05c209cded453a3" "ImageFilePath" => "/Users/lumeng/Library/Caches/com.kuvva.Kuvva-Wallpapers/Wallpapers/ba94b2bd2056554428d9792fe6523877" "ImageFilePath" => "/Users/lumeng/Library/Caches/com.kuvva.Kuvva-Wallpapers/Wallpapers/ba94b2bd2056554428d9792fe6523877" "ImageFilePath" => "/Users/lumeng/Library/Caches/com.kuvva.Kuvva-Wallpapers/Wallpapers/ba94b2bd2056554428d9792fe6523877" "ImageFilePath" => "/Users/lumeng/Library/Containers/com.kuvva.Kuvva-Wallpapers/Data/Library/Application Support/Kuvva/1ac5adfe90bd5b73367769111329ddf9de396d3e" "ImageFilePath" => "/Users/lumeng/Library/Containers/com.kuvva.Kuvva-Wallpapers/Data/Library/Application Support/Kuvva/1ac5adfe90bd5b73367769111329ddf9de396d3e" "ImageFilePath" => "/Users/lumeng/Library/Containers/com.kuvva.Kuvva-Wallpapers/Data/Library/Application Support/Kuvva/1ac5adfe90bd5b73367769111329ddf9de396d3e" "ImageFilePath" => "/Users/lumeng/Library/Containers/com.kuvva.Kuvva-Wallpapers/Data/Library/Application Support/Kuvva/2f47f6bcbb93acb1729beeb9177e2a64c3b688dc"  It turns out every image used as wallpaper will be cached in /Users/lumeng/Library/Containers/com.kuvva.Kuvva-Wallpapers/Data/Library/Application Support/Kuvva/  I then added cron job to save the images in a different directory: $ crontab -l
## save newest Kuvva wallpaper image
@hourly rsync -qrihp "/Users/lumeng/Library/Containers/com.kuvva.Kuvva-Wallpapers/Data/Library/Application Support/Kuvva/" "/Users/lumeng/Dropbox/Image/Wallpaper/kuvva_wallpaper/new/"

## copy new images out
@daily rsync -qrihp "/Users/lumeng/Dropbox/Image/Wallpaper/kuvva_wallpaper/new/" "/Users/lumeng/Dropbox-x4430/Dropbox/DataSpace-Dropbox/Image/Wallpaper/kuvva_wallpaper/"

## add filename extesion to image files names


The original file names don't have extension. One can use ImageMagick's identify utility to detect the image file type and add file name extension accordingly. The Bash script I'm using to do that add-filename-extension-to-image-files.sh:

#!/usr/bin/env bash

## Summary: identify image file type and add filename extension appropriately when applicable
## Example:
## $add-filename-extension-to-image-files.sh -d ~/temp IMAGE_FILE_DIR= while getopts d:d opt; do case$opt in
d)
IMAGE_FILE_DIR=$OPTARG ;; esac done if <span class="createlink"> -d &#36;IMAGE FILE DIR </span>; then FILES=find$IMAGE_FILE_DIR -maxdepth 1 -type f
else
FILES=()
fi

IMAGEMAGICK_IDENTIFY_BIN="/opt/local/bin/identify"

for file in $FILES do filename=$(basename "$file") extension="${filename##*.}"
if [ "$extension" == "$filename" ]; then
newextension=$IMAGEMAGICK_IDENTIFY_BIN$file | cut -d ' '  -f 2 | tr '[:upper:]' '[:lower:]'
rsync -qrthp $file "$file.$newextension" rm$file
fi
done

## END

Posted Tue May 16 23:59:39 2017

Meng Lu, 2013-7-6

Suppose you want to remove newlines in between the Chinese characters:

南海少年遊俠客，



-- note that the 1st and 2nd Chinese comma ， actually have two or more white spaces following them -- and change it to a single line

南海少年遊俠客，詩成嘯傲凌滄州，曾因酒醉鞭名馬，生怕情深累美人。


One way to do this is using Emacs.

## Use query-replace-regexp

Press M-x, and type query-replace-regexp, or as a shortcut C-M-%;

Type regexp to match:

$$[[:nonascii:\]]$$ *
*$$[[:nonascii:\]]$$


Note the line break in the regexp need to be typed into the Emacs minibuffer with C-q C-j.

Type regexp to substitute:

\1\2


This means the white space character(s) (if any) and newline character between non-ASCII characters will be removed in the substituted version, so the result is the character on the first line followed by that on the second line.

## Use fill-paragraph

• Set fill-column variable, which controls how wide a line of text can go before line-wrapping to a very large value for the current buffer: C-x f, 10000000

• Highlight the paragraph you'd like to modify: move cursor to the beginning, hold Shift down and move up and down arrow to extend and decrease the selection;

• Press M-x, and type fill-paragraph.

This should remove all newline characters in the text. Interestingly, if there are multiple white space characters at the end of lines before the new line character, it will keep one of them:

南海少年遊俠客， 詩成嘯傲凌滄州， 曾因酒醉鞭名馬，生怕情深累美人。


Note there is an additional white space after the 1st and the 2nd ，.

The single white space character is actually still redundant, that can be corrected by

M-x query-replace-regexp
， *
，

Posted Tue May 16 23:59:39 2017

After reading "The ASA's statement on p-values: context, process, and purpose", and some other related references, here are some excerpts and notes I took on p-value and null-hypothesis significance testing.

• American Statistical Association (ASA) has stated the following five principles about p-values and null hypothesis significance testing:

1. "P-values can indicate how incompatible the data are with a specified statistical model."
2. "P-values do not measure the probability that the studied hypothesis is true, or the probability that the data were produced by random chance alone."
3. " … It is a statement about data in relation to a specified hypothetical explanation, and is not a statement about the explanation itself."
4. "Scientific conclusions and business or policy decisions should not be based only on whether a p-value passes a specific threshold."
5. "… Practices that reduce data analysis or scientific inference to mechanical “bright-line” rules (such as “p < 0.05”) for justifying scientific claims or conclusions can lead to erroneous beliefs and poor decision-making. …"
6. "Proper inference requires full reporting and transparency."
7. "A p-value, or statistical significance, does not measure the size of an effect or the importance of a result."
8. "… Smaller p-values do not necessarily imply the presence of larger or more important effects, and larger p-values do not imply a lack of importance or even lack of effect. Any effect, no matter how tiny, can produce a small p-value if the sample size or measurement precision is high enough, and large effects may produce unimpressive p-values if the sample size is small or measurements are imprecise. …"
• Null hypothesis is usually a hypothesis that assumes that observed data and its distribution is a result of random chances rather than that of effects caused by some intrinsic mechanisms. It is usually what is to disapprove or to reject in order to establish evidence to or belief in that there is some real effect due to underlying intrinsic mechanism. In turn, the details of the statistical model used in this evaluation can be used to make quantitative estimations on properties of the underlying mechanism.

• The p-value is the probability that one has falsely rejected the null hypothesis.

• The smaller is, the smaller the chance is that one has falsely rejected the null hypothesis.
• Being able to reject or not being able to reject the null hypothesis may tells one if the observed data suggests that there is an effect, however, it does not tell one how much an effect there is and if the effect is true. See effect size.
• "a p-value near 0.05 taken by itself offers only weak evidence against the null hypothesis".
• UK statistician and geneticist Sir Ronald Fisher introduced the p-value in the 1920s. "The p-value was never meant to be used the way it's used today."
• As ASA p-value principle No. 3 states, the decision to reject the null hypothesis should not be based solely on if p-value passes a "bright-line" threshold. Rather, in order to reject the null hypothesis, one must make a subjective judgment involving the degree of risk acceptable for being wrong. The degree of risk of being wrong may be specified in terms of confidence levels which characterizes the sampling variability.

• Alternative ways used for referring to data cherry-picking include data dredging, significance chasing, significance questing, selective inference, p-hacking, snooping, fishing, and double-dipping.

• "The difference between statistically significant and statistically insignificant is not, itself, statistically significant."

• "According to one widely used calculation [1], a p-value of 0.01 corresponds to a false-alarm probability of at least 11%, depending on the underlying probability that there is a true effect; a p-value of 0.05 raises that chance to at least 29%." See the following figure:

p-value and probable cause.png

## Some related concepts

• The standard score, or z-score is the deviation from the mean in units of standard deviation. A small p-value corresponds to a large positive z-score.

• 68-95-99.7 rule

• Magnitude - How big is the effect? Large effects are more compelling than small ones.
• Articulation - How specific is it? Precise statements are more compelling than imprecise ones.
• Generality - How generally does it apply?
• Interestingness - interesting effects are those that "have the potential, through empirical analysis, to change what people believe about an important issue".
• Credibility - Credible claims are more compelling than incredible ones. The researcher must show that the claims made are credible.

## References

1. Goodman, "Of P-Values and Bayes: A Modest Proposal," S. N. Epidemiology 12, 295–297 (2001), http://journals.lww.com/epidem/fulltext/2001/05000/of_p_values_and_bayes__a_modest_proposal.6.aspx
Posted Tue May 16 23:59:39 2017

## Principle of the single big jump

Learned about an interesting probability principle about random walk called "principle of the single big jump"

A high overall displacement with respect to the orgin resulting from doing random walk might be essentially contributed by one single very large step, i.e. a leap.

Technically, assume the step sizes are independent random variables $X_1, X_2, \cdots$ with heavy-tailed (technically, subexponential) distributions, then the maximum and the sum have the same asymptotic distribution. That is, as x goes to infinity,

$\lim${x \rightarrow \infty} P( X_1 + X_2 + \cdots + X_n > x ) = \lim{x \rightarrow \infty} P( max(X_1, X_2, \cdots , X_n > x )

References:

• http://arxiv.org/pdf/math/0509605v1.pdf

• http://www.johndcook.com/blog/2011/08/09/single-big-jump-principle/

Posted Tue May 16 23:59:39 2017 Tags:

Here is some notes about running forms:

## The Pose Method running

• as foot lands on ground, shoulder, hip and ankle are aligned to a straight line and sloped forward

• lands on forefoot

## Chi running

Aims at mechanical efficiency and injury-free.

• postural alignment: align shoulder, hip and anckle to a straight line (posture line)

• midfoot (full foot) landing under posture line

• forward lean from relaxed ankles

• heel lift via knee bend

• arm swing to the rear

Kenyan elite runner Moses Mosop's running form in slow motion. (It seems sometimes his landing foot is in front of his hip horizontal position, but not sure if it's also in front of the overall center of gravity of the body.

## Bare foot running

Just one key point: run barefoot and let it land naturally on forefoot instead of on heel. Wearing running shoes makes it comfort to land on heels and easy to injure yourself.

Posted Tue May 16 23:59:39 2017 Tags:

Gradle (http://www.gradle.org/) is a modern software building automation tool. I've recently picked it up. Here is a note on setting up a minimal and not-entirely-trivial Java project using Gradle and building (including running tests) with it: 〈 Installing and configuring Gradle 〉.

Posted Tue May 16 23:59:39 2017 Tags:

A few numbers to illustrate how bad time complexity is, just so the well known fact is more concrete. It does not take much to make it useless for practical purposes in the Big Data era.

The plot vs. , should illustrate how large the difference between the two growth rates and really is: at some time scale, a computation with input size can be computed within a day by but will take about two human lifetimes (about 50000 days) to do:

Posted Tue May 16 23:59:39 2017

With the advent of Java 8 and java.util.stream and lambda expressions in it, one can do data munging in Java 8 as the following:

Java 8:

public static void main(String[] args) {

HashMap<Integer, Character> nucleobases = new HashMap<> ();
nucleobases.put(1, 'A');
nucleobases.put(2, 'G');
nucleobases.put(3, 'T');
nucleobases.put(4, 'C');

range(0, 100)
// generate a stream containing random strings of length 10
.mapToObj(i -> randomNucleicAcidSequence(new Random(), nucleobases, 10))
// sort the elements in the stream to natural ordering
.sorted()
// group strings into sub-lists and wrap them into a stream
.collect(groupingBy(name -> name.charAt(0)))
// print each sub-list's common initial letter and the constituent strings
.forEach((letter, names) -> System.out.println(letter
+ "\n\t"
+ names.stream().collect(joining("\n\t"))));
}

public static String randomNucleicAcidSequence(Random r, Map<Integer, Character> map, int length) {
return r.ints(1, 4).limit(length).mapToObj(
x -> Character.toString(map.get(x))).collect(Collectors.joining());
}


↑ Code in GitHub

This is remarkbly similar to a program written in Mathematica using the munging style I use all the time:

(* There is not a built-in RandomString[…] *)
nucleobases = {"A", "G", "T", "C"};

randomNucleicAcidSequence[length_Integer] := StringJoin[nucleobases[[#]] & /@ RandomInteger[{1, 4}, length]]

Composition[
Print[StringJoin[{#[[1]],":\n",StringJoin[Riffle[#[[2]],"\n"]]}]]& /@ # &,
(First[Characters[Part[#, 1]]]-> #) & /@ # &,
GatherBy[#, First[Characters[#]]&]&,
Sort,
Map[Function[{i}, randomNucleicAcidSequence[10]], #] &
][Range[0, 100]]


The output of both program print the mucleic acid sequences grouped by initial neuclobase:

A:
AAAATACCTC
AAATAATCAT
AACAGATACG
ACAACTACGG
ACCATCAAAT
...
C:
CAACGGGGTT
CAAGAAGAGC
CACACCCACA
CACACTCTAC
CAGGACCGGA
...
G:
GAACGTTTCA
GAACTAAGCG
GACCAGTTCT
GAGAACACGT
GAGCCGCCAC
...
T:
TAAAATTGCC
TAAGGTGAGG
TAGCCGGTTA
TAGGCGGTGA
TAGTTCGAGC
...


Data streams and algorithms for processing them is a recently hot research area in computer science. It seems to me it will be natural for Java standard library to include more and more stream algorithms in future.

Related:

Posted Tue May 16 23:59:39 2017

After reading many references and trying various settings, I finally got Oracle JDK 7 and JDK 8 installed on my MacBook Pro running OS X 10.9.2, while leaving macOS's own system Java which is Java 6 unchanged. I've written a note 〈 Installing and configuring Java on Mac 〉.

Posted Tue May 16 23:59:39 2017 Tags:

Using or matching backslash character ('\') correctly with grep or similar tools such as ack takes some investigation. Here is a note on what I found.

## GNU grep

I created a text file temp.txt with

\
\\
\\\
\:0251


What I get from trying to match each line with grep (GNU bash 3.2.48) from Bash command line:

• Use "\\\\" to match \

$grep -x --color=always "\\" temp.txt \ • So to match \\, \\\\\\\\ is needed:$ grep -x --color=always "\\\\" temp.txt \

• To Match \:0251 which is the Mathematica InputForm string of a Chinese character:

grep -x --color=always "\\:0251" temp.txt :0251 Posted Tue May 16 23:59:39 2017 A tree graph is uniquely identified by the set of its edges {1->2, 1->3, 2->4, 2->5, 2->6, 3->7, 3->8}  This syntax is used in Wolfram Language's graph specification. The actual picture of the graph can be displayed in Mathematica with: Graph[{1 -> 2, 1 -> 3, 2 -> 4, 2 -> 5, 2 -> 6, 3 -> 7, 3 -> 8}]  The question is: is there a better way to represent the graph in a compact, readable, and text-only form? ## Why? A few use cases which might give some motivation: 1. In tweets, sometimes, one might want to include graphs accurately represented by a short sequence of regular characters, which ideally should also be easy to parse as-is (instead of only after being processed by software such as Mathematica's Graph[...] function.) 2. When labeling/classifying objects, besides using tags which are a flat list of IDs of form {tag1, tag2, ...} and equivalent to a graph with vertexes but no edges, one can use a tree graph represented in a succinct form to carry more information about the hierarchical classification of the object. ## One way: list of tags One obvious way is to write the list of edges such as {1->2, 1->3,2->4, 2->5, 2->6, 3->7, 3->8}, which uniquely identifies the graph. It's not very compact -- vertexes with multiple children is repeated -- and neither very readable -- one need to do much mental processing and memorizing in order to understand and imagine the structure of the graph. ## Another way: "graphlet" Another way I designed is to write it into the following form, which I dubbed "graphlet representation" of (tree) graphs: 1{2{4, 5, 6}, 3{7, 8}}  So, in graphlet representation: 1. The edges in a graph are represented by an edge character (backtick for instance); 2. Child vertexes are grouped by a pair of grouping characters ({ and } for instance). Note that the graphlet representation is shorter than the flat list-of-edges representation. ## Classification graphlet is more informative than list of tags An object's classification is often represented as a list of tags {tag1, tag2, ..., tag8}  However, if the classification is hierarchical, a graphlet representation can easily records more information about the classification structure: tag1{tag2{tag4, tag5, tag6}, tag3{tag7, tag8}}  ### Application examples To represent the classification of a problem, a sequence of key words is often used, e.g. astrophysics, cosmology, general-relativity, star, galaxy  One can actually additionally include its hierarchical classification structure by using a graphlet: science{physics{astrophysics, cosmology, general-relativity}, astronomy{star, galaxy}}  ### Advantages Preserving the full graph structure, many graph characteristics can be exploited, and graph-theoretic methods can be used for analyzing metadata in this form. For example, graphlets can be systematically shorten/simplified by pruning leaves. Posted Tue May 16 23:59:39 2017 In Mathematica and Wolfram Language, with the feature called Texture it's easy to draw 3D solids such as polyhedra with image textures on the surface such as the faces of polyhedra. When playing with this, I drew a rhombic hexecontahedron (properties), the logo of Wolfram|Alpha with national flags on its faces: ☝ Rhombic hexecontahedron wrapped with the 60 most populous countries' national flags, one different flag drawn on each of the 60 rhombic faces. (It's somewhat odd that the most populous countries happen to have a lot of red and green colors in their national flags.) The code is in my Git repository. Remarks: • It was a bit tricky to understand what the option VertexTextureCoordinates is and does, and how to set it up for what I'm trying to draw. It turns out for putting the national flags on the faces of a rhombic hexecontahedron, since both are (usually) isomorphic to a square, VertexTextureCoordinates -> {{0, 0}, {1, 0}, {1, 1}, {0, 1}} is would work as it aligns the four vertices of a rectangular flag to the four vertices of a rhombic face. Something left to be improved: • Avoid flags that's not isomorphic to the face of the given polyhedron; • Compute the suitable values of VertexTextureCoordinates to reduce or even avoid distortion of the flags; • Find a reasonable way to align national flags of shapes not isomorphic to the shape of a face of a given polyhedron, such that the main feature of the national flag wrapped on a polyhedron face is well recognizable. Posted Tue May 16 23:59:39 2017 Wrote a note about backing up MediaWiki including its database (MySQL or SQLite), content pages exported as an XML file, all images, and the entire directory where MediaWiki installed, which includes LocalSettings.php and extensions that usually contain customization. Posted Tue May 16 23:59:39 2017 ## Use rlwrap to enable arrow keys, command history, history completion and search more mma.sh

#!/usr/bin/env bash

rlwrap --file /opt/local/share/rlwrap/MathKernel /Applications/Mathematica.app/Contents/MacOS/MathKernel

$./mma.sh Mathematica 8.0 for macOS x86 (64-bit) Copyright 1988-2011 Wolfram Research, Inc. In[1]:= 1+Sin[2]//N Out[1]= 1.9093 In[2]:=  At an input In[XXX], one can • press up and down arrow keys to insert previously entered commands • press left and right arrow keys to move cursor ## Use XXX to colorise terminal Posted Tue May 16 23:59:39 2017 Getting myself ready to take two courses this fall, online. This will actually be my first experience of "distance education", or the closest thing to it. The two courses are I'm taking them because i am interested in the subjects and hopefully can put the theory and knowledge I learned into real use. Posted Tue May 16 23:59:39 2017 Was trying to come up with some style guide for commit messages in revision control systems (Subversion, Git, Mercurial, etc.) as very commonly it is a place where people tend to use poor or poorly-managed writing styles when they are in a hurry, or not. The contrast between the sophistication of engineering of automatic revision control systems and the lousiness of majority of commit messages I've seen is ironic, at least to me. Imagine a situation where one wants to data mine the commit message history in a revision control repository and do text analytics with it to find out some interesting thing about the project, it can be very useful for the commit messages to have a uniform format and hence more machine processable. ## Some general rule for revisions and commits • try to make a commit self-contained, i.e. logically-closely-related revisions should be committed together in one commit rather than in multiple separate commits • try to make logically-unrelated revisions in their own separate commits rather than make a big revision including many changes that are logically independent to each other ## Some general rules for having uniform format of the commit messages • Use a short summary sentence at the top to summarize the main purpose/content of the revision. As both Git and Mercurial use the first sentence ending with period (.) for various purposes. If there are itemized details in the following, use ellipsis (...) instead. • Leave a blank line below the summary sentence. • Keep the lines shorter than 72 characters as that produces better line wrapping behavior for hg log and git log. • If necessary, use bullet items below the blank line to describe details. Use different bullet for different types of revisions • + added new content • - removed existing content • % modified existing content and/or behavior • ! fixed bugs and errors • ~ tweaked content by making small and/or insubstantial changes • * misc and/or other unclassified changes • > to-do for future • Try to use expressions with simple past tense or noun phrases. Example commit message to a revision control repository: Changed the logic dealing with magic input. + added an awesome new functionality - removed obsolete content % modified some existing code ! fixed a bad bug ~ tweaked some content * some misc unclassified changes > make the logic more robust  Posted Tue May 16 23:59:39 2017 "I read a study that measured the efficiency of locomotion for various species on the planet. The condor used the least energy to move a kilometer. Humans came in with a rather unimpressive showing about a third of the way down the list....That didn't look so good, but then someone at Scientific American had the insight to test the efficiency of lomotion for a man on a bicycle and a man on a bicycle blew the condor away. That's what a computer is to me: the computer is the most remarkable tool that we've ever come up with. It's the equivalent of a bicycle for our minds." Posted Tue May 16 23:59:39 2017 The prestigious scientific magazine Nature published a news report titled 'Oympic feats raise suspicion' by Ewen Callaway (@ewencallaway), which talked about 16-year-old Chinese Olympic gold medalist swimmer Ye Shiwen, and says ... how an athlete's performance history and the limits of human physiology could be used to catch dopers ... Was Ye’s performance anomalous? Yes. Her time in the 400 IM was more than 7 seconds faster than her time in the same event at a major meet in July 2011. But what really raised eyebrows was her showing in the last 50 metres, which she swam faster than US swimmer Ryan Lochte did when he won gold in the men’s 400 IM on Saturday, with the second-fastest time ever for that event. ... Doesn't a clean drug test during competition rule out the possibility of doping? No, says Ross Tucker, ... Other than this fact-lacking and logically flawed weak reasoning, there is not much data or scientific analysis in this report. It seems a rather substandard article for Nature, especially on a potentially controversial topic. One of the reader comments was particularly well-ground criticism on it: Lai Jiang said: It is a shame to see Nature, which nearly all scientists, including myself, regard as the one of the most prestigious and influential physical science magazines to publish a thinly-veiled biased article like this. Granted, this is not a peer-reviewed scientific article and did not go through the scrutiny of picking referees. But to serve as a channel for the general populous to be in touch with and appreciate sciences, the authors and editors should at least present the readers with facts within proper context , which they failed to do blatantly. First, to compare a player's performance increase, the author used Ye's 400m IM time and her performance at the World championship 2011, which are 4:28. 43 and 4:35.15 respectively, and reached the conclusion that she has got an "anomalous" increase by ~7 sec (6.72 sec). In fact she's previous personal best was 4:33.79 at Asian Games 2010 [1]. This leads to a 5.38 sec increase. In a sport event that 0.1 sec can be the difference between the gold and silver medal, I see no reason that 5.38 sec can be treated as 7 sec. Second, as previously pointed out, Ye is only 16 years old and her body is still developing. Bettering oneself by 5 sec over two years may seem impossible for an adult swimmer, but certainly happens among youngsters. Ian Thorpe's interview revealed that his 400m freestyle time increased 5 sec between the age of 15 and 16 [2]. For regular people including the author it may be hard to imagine what an elite swimmer can achieve as he or she matures, combined with scientific and persistent training. But jumping to a conclusion that it is "anomalous" based on "Oh that's so tough I can not imagine it is real" is hardly sound. Third, to compare Ryan Lochte's last 50m to Ye's is a textbook example of what we call to cherry pick your data. Yes, Lochte is slower than Ye in the last 50m, but (as pointed out by Zhenxi) Lochte has a huge lead in the first 300m so that he chose to not push himself too hard to conserve energy for latter events (whether this conforms to the Olympic spirit and the "use one' s best efforts to win a match" requirement that the BWF has recently invoked to disqualify four badminton pairs is another topic worth discussing, probably not in Nature, though). On the contrary, Ye is trailing behind after the first 300m and relies on freestyle, which she has an edge, to win the game. Failing to mention this strategic difference, as well as the fact that Lochte is 23.25 sec faster (4:05.18) over all than Ye creates the illusion that a woman swam faster than the best man in the same sport, which sounds impossible. Put aside the gender argument, I believe this is still a leading question that implies the reader that something fishy is going on. Fourth, another example of cherry picking. In the same event there are four male swimmers that swam faster than both Lochter (29.10 sec) [3] and Ye (28.93 sec) [4]: Hagino (28.52 sec), Phelps (28.44 sec), Horihata (27.87 sec) and Fraser-Holmes (28.35 sec). As it turns out if we are just talking about the last 50m in a 400m IM, Lochter would not have been the example to use if I were the author. What kind of scientific rigorousness that author is trying to demonstrate here? Is it logical that if Lochter is the champion, we should assume he leads in every split? That would be a terrible way to teach the public how science works. Fifth, which is the one I oppose the most. The author quotes Tucks and implies that a drug test can not rule out the possibility of doping. Is this kind of agnosticism what Nature really wants to educate its readers? By that standard I estimate that at least half of the peer-reviewed scientific papers in Nature should be retracted. How can one convince the editors and reviewers that their proposed theory works for every possible case? One cannot. One chooses to apply the theory to typical examples and demonstrate that in (hopefully) all scenarios considered the theory works to a degree, and that should warrant a publication, until a counterexample is found. I could imagine that the author has a skeptical mind which is critical to scientific thinking, but that would be put into better use if he can write a real peer-reviewed paper that discusses the odds of Ye doping on a highly advanced non-detectable drug that the Chinese has come up within the last 4 years (they obviously did not have it in Beijing, otherwise why not to use it and woo the audience at home?), based on data and rational derivation. This paper, however, can be interpreted as saying that all athletes are doping, and the authorities are just not good enough to catch them. That may be true, logically, but definitely will not make the case if there is ever a hearing by FINA to determine if Ye has doped. To ask the question that if it is possible to false negative in a drug test looks like a rigged question to me. Of course it is, other than the drug that the test is not designed to detect, anyone who has taken Quantum 101 will tell you that everything is probabilistic in nature, and there is a probability for the drug in an athlete's system to tunnel out right at the moment of the test. A slight change as it may be, should we disregard all test results because of it? Let's be practical and reasonable. And accept WADA is competent at its job. Her urine sample is stored for 8 years following the contest for future testing as technology advances. Innocent until proven guilty, shouldn't it be? Sixth, and the last point I would like to make, is that the out-of- competition drug test is already in effect, which the author failed to mention. Per WADA president's press release [5], drug testing for olympians began at least 6 months prior to the opening of the London Olympic. Furthermore there are 107 athletes who are banned from this Olympic for doping. That maybe the reason that everyone will pass at the Olympic games. Hardly anyone fails in competition testing? Because those who did dope are already sanctioned? The author is free to suggest that a player could have doped beforehand and fool the test at the game, but this possibility certainly is ruled out for Ye. Over all, even though the author did not falsify any data, he did ( intentionally or not) cherry pick data that is far too suggestive to be fair and unbiased, in my view. If you want to cover a story of a suspected doping from a scientific point of view, be impartial and provide all the facts for the reader to judge. You are entitled to your interpretation of the facts, and the expression thereof in your piece, explicitly or otherwise , but only showing evidences which favor your argument is hardly good science or journalism. Such an article in a journal like Nature is not an appropriate example of how scientific research or report should be done. 1 http://www.fina.org/H2O/index.php?option=com_wrapper&view=wrapper&Itemid=1241 2 http://www.youtube.com/watch?v=8ETPUKlOwV4 3 http://www.london2012.com/swimming/event/men-400m-individual-medley/phase=swm054100/index.html 4 http://www.london2012.com/swimming/event/women-400m-individual-medley/phase=sww054100/index.html 5 http://playtrue.wada-ama.org/news/wada-presidents-addresses-london-2012-press-conference/?utm_source=rss&utm_medium=rss&utm_campaign=wada-presidents-addresses-london-2012-press-conference How well said! Posted Tue May 16 23:59:39 2017 推荐 NHK 的纪录片《圆的战争》。“圆的战争”指的是以操控通货为主的经济战，金融战。日本侵华战争时期通过傀儡国银行（朝鲜银行，中国联合准备银行等）和日本的横滨正金银行勾结，在中国发行实质上无准备金的货币（如“银联券”）扰乱中国经济。日本用这样的货币直接供给在华日军军费，而非直接用日元，而后傀儡银行的日元准备金如数归还日本国库，借此能达到循环不断、无成本、无限地供给军费给侵华日军，“以战养战“的目的。有了这样来的军费支持，侵华日军才能维持长期侵略战争。片中称此手段为“往来借款”： ☝ 示意图：日本以“往来借款”手段支持侵华日军军费 圆的战争 from kamonka on Vimeo. ## 外部链接 Posted Tue May 16 23:59:39 2017 ## Find the top committers to a file ### Mercurial: $ hg log ./.hgignore | grep user: | sort | uniq -c | sort -r | head -3
5 user:        Michael <michael@foomail.com>
3 user:        Mary <mary@barmail.com>


$git log ./.gitignore | grep ^Author: | sort | uniq -c | sort -r | head -3 6 Author: Michael <michael@foomail.com> 2 Author: Mary <mary@foomail.com>  ### CVS: $ cvs log ./foo/bar.cpp | grep author | awk '{print \$5}' | sort | uniq -c | sort -r | head -3
92 Michael;
72 Mary;
`
Posted Tue May 16 23:59:39 2017

I recently discovered a tool for checking Java code formatting and other coding style related issues -- CheckStyle. It features a list of style checks such as

...
* Restricts nested boolean operators (&&, ||, &, | and ^) to a specified depth (default = 3).
* Checks for empty blocks.
* Checks for long source files.
...

There's also an Eclipse plugin for it: https://checkstyle.org/eclipse-cs/#!/. The Eclispe update URL is http://eclipse-cs.sf.net/update/. As of at least Eclipse 4.8.0 (Aug., 2018), this update URL no longer works.

There is a caveat at the time I install it:

the plugin's homepage http://eclipse-cs.sourceforge.net/ gives http://sevntu-checkstyle.github.com/sevntu.checkstyle/update-site/ as the installation URL, but it does not work. However on http://eclipse-cs.sourceforge.net/downloads.html a different installation URL http://eclipse-cs.sf.net/update/ is given, which turns out to work correctly. It seems someone has not updated the Web pages consistently. And it has certainly confused some: http://stackoverflow.com/questions/11865025/error-while-installing-check-style-plugin-for-eclipse.

Posted Tue May 16 23:59:39 2017

Some raw thoughts about human society's high and accelerating information metabolism.

With mobile phones (~1950-), databases (~1960-), personal computers and computer softwares (~1970-), emails (~1970-), the Web (~1990-), social networks such as blogs, Facebook, Twitter, ... (~2000-), and the latest buzz concepts of computing cloud and big data, human society has undergone an exponential growth in its appetite for data and information, as well as its industrial power of producing, transforming, and distributing data and information. It's mind boggling to think about how the way human deals with data, this artificial raw material, is constantly shaping our current moment of history. If one thinks of human society as a complex physical system; the history is the time evolution of the system. Much of the important events happened in the history that history books happened to have a record for, and -- as a result -- get remembered by the recent generations, had been direct consequences of physical constraints on the system. Limitations tend to introduces complex mechanics and then makes features of the system; unboundedness tend to make a system flat and simple. A good portion of those constraints can be attributed to the physical limits on the volume and speed that human society could produce, transform, understand, and distribute information. Data, information, and knowledge are in order of increasing extent of extraction, abstraction, and purification, in terms of human perception and thinking. But in evolution of society, it is almost exactly the opposite order. Three hundred years ago, there might not be any good way -- and hence any interest -- in doing anything with data, as human were simply not armed with efficient tools to collect data. Good first-hand information about the real world were scarce and normally didn't spread far. Emperor of China then didn't have a good idea of the existence of Europe. And much imagination can be given to what modern history could be different if it was not the case. As electricity and electromagnetic waves were utilized for transforming and distributing information, telegraphs and telephones were inevitably invented, events happening at one place could be learned by people at a different place thousands of miles away shortly after. This means the information metabolism of human society has started accelerating. It's just like human with high metabolic rate have higher heart rate. And like appetite for food, when one eats fast, he is more likely to have need for more food. Since then, human society has observed an accelerating appetite and metabolism of data and information. In the present decade, it is no longer uncommon for an average person to find himself knowing about some trivia about some notable person on the other side of the planet. Information, good or bad, useful or not, is now like air, filling any empty space it can find, seemingly by its own force and law of nature. If the society is a biological creature, its appetite and metabolism for information has certainly grown larger and faster and is acceleratingly growing. Given the analogue to biological metabolism, my next question is: is high information metabolic rate good for us? Are humans designed to live like that? How do you live a healthy life style in terms of information metabolism measures?

Posted Tue May 16 23:59:39 2017

Hello world!

Posted Tue May 16 23:59:39 2017

## English Spelling Alphabet v1.0

• A: Apple
• B: Boy
• C: Cat
• D: Dog
• E: Elephant
• F: Flower
• G: Girl
• H: Hotel
• I: India
• J: John
• K: Kiss
• L: Lincoln
• M: Mary
• N: November
• O: Orange
• P: Peter
• Q: Queen
• R: Roger
• S: Sugar
• T: Tom
• U: Umbrella
• V: Victory
• W: William
• X: X-ray
• Y: Yellow
• Z: Zebra
Posted Tue May 16 23:59:39 2017