Meng Lu, 2013-7-6

Suppose you want to remove the newlines between lines of Chinese characters:

南海少年遊俠客,    
詩成嘯傲凌滄州,  
曾因酒醉鞭名馬,
生怕情深累美人。

Note that the 1st and 2nd Chinese commas are each followed by two or more white spaces. The goal is to turn the text into a single line:

南海少年遊俠客,詩成嘯傲凌滄州,曾因酒醉鞭名馬,生怕情深累美人。

One way to do this is with Emacs.

Use query-replace-regexp

Press M-x and type query-replace-regexp, or use the shortcut C-M-%.

Type regexp to match:

\([[:nonascii:]]\) *
 *\([[:nonascii:]]\)

Note that the line break in the regexp needs to be typed into the Emacs minibuffer with C-q C-j.

Type the replacement text:

\1\2

This means that the white space character(s) (if any) and the newline character between two non-ASCII characters are removed, so the result is the last character of the first line followed directly by the first character of the second line.

Use fill-paragraph

  • Set the fill-column variable, which controls how wide a line of text can get before it is wrapped, to a very large value for the current buffer: C-x f, then 10000000 at the prompt (or, equivalently, C-u 10000000 C-x f);

  • Select the paragraph you'd like to modify: move the cursor to its beginning, hold Shift down, and use the up and down arrow keys to extend or shrink the selection;

  • Press M-x, and type fill-paragraph.

This should remove all newline characters in the text. Interestingly, if a line ends with several white space characters before the newline character, one of them is kept:

南海少年遊俠客, 詩成嘯傲凌滄州, 曾因酒醉鞭名馬,生怕情深累美人。

Note that there is an extra white space after the 1st and the 2nd comma.

That single white space character is still redundant; it can be removed with

M-x query-replace-regexp
, *
,
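The same cleanup can be sketched in Python. This is a minimal illustration, not part of the Emacs workflow: it assumes the commas are full-width (U+FF0C) and starts from a string shaped like the fill-paragraph result shown above. The names filled and strip_spaces_after_comma are hypothetical.

```python
import re

# Text as it looks after the fill-paragraph step: one space is left after
# the 1st and 2nd comma. Full-width commas (U+FF0C) are an assumption.
filled = "南海少年遊俠客, 詩成嘯傲凌滄州, 曾因酒醉鞭名馬,生怕情深累美人。"


def strip_spaces_after_comma(text: str) -> str:
    """Replace a full-width comma plus any following spaces with the comma alone."""
    return re.sub(", *", ",", text)


print(strip_spaces_after_comma(filled))
```

Because the pattern allows zero spaces, commas that are already followed directly by a character are matched and rewritten unchanged.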
Posted Tue May 16 23:59:39 2017 Tags:
