What is the Control-M Character and Different Ways to Remove It


1.What is Control-M Character:


A Control-M (^M) is a carriage return (CR or \r).

-  In DOS/Windows text files, every line typically ends with a CR (Carriage Return) followed by an LF (Line Feed), i.e. the combination \r\n.

-  In Unix/Linux text files, every line typically ends with a single LF (Line Feed), i.e. \n.

So if we send files from Windows to Unix/Linux with a tool that copies them byte for byte (pscp, or ftp in binary mode), the lines in those files still end with CR and LF. If we then run those files on Unix/Linux we can get errors, because Unix/Linux does not treat the Carriage Return (CR or \r) as part of the line ending; it is read as an extra character at the end of each line.

These Carriage Returns (CR, i.e. \r) are called Control-M (^M) characters.
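
To see why the extra character causes trouble, here is a minimal sketch (the file name answer_win.txt and the variable names are made up for illustration). It writes one line with a CR LF ending and reads it back with the shell's read built-in. The CR stays attached to the value, so a comparison against the expected word fails.

```shell
# Hypothetical demo file: one line with a Windows-style CR LF ending.
workdir=$(mktemp -d)
printf 'yes\r\n' > "$workdir/answer_win.txt"

# read strips the trailing LF but keeps the CR as part of the value.
read -r answer < "$workdir/answer_win.txt"

if [ "$answer" = "yes" ]; then
    result="match"
else
    result="no match"      # this branch runs: the value is really "yes" plus a CR
fi

# The value is 4 characters long (y, e, s, CR), not 3.
length=${#answer}
echo "$result, length=$length"
rm -rf "$workdir"
```

The same effect is behind many of the confusing errors seen when a script or configuration file written on Windows is used on Linux.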

2.File Transfer from Windows to Linux:

1).I created a file named sample_win.txt on my Windows 7 machine and transferred it to a Linux machine using pscp, as below.

C:\VMWARE\bk>pscp C:\Users\Bharath\Desktop\win\sample_win.txt root@192.168.183.139:/root/machintosh
sample_win.txt            | 0 kB |   0.0 kB/s | ETA: 00:00:00 | 100%

2).I created another file named sample_unix.txt on Linux with the same content as the Windows file, as below.

[root@localhost machintosh]# cat > sample_unix.txt
Hi
This is
sample file
[root@localhost machintosh]#

[root@localhost machintosh]# pwd
/root/machintosh

[root@localhost machintosh]# ls -lrt
total 8
-rw-r--r--. 1 root root 27 Oct 13 09:45 sample_win.txt
-rw-r--r--. 1 root root 23 Oct 13 09:47 sample_unix.txt
[root@localhost machintosh]#

3.Difference between Windows file and Linux file

The difference between the Windows file and the Unix/Linux file can be easily seen with the od command (od is an octal dump utility; -b shows each byte as an octal value and -c shows each byte as an ASCII character or backslash escape).

In DOS/WINDOWS, all lines end with a CR/LF combination or \r\n.
In UNIX/LINUX, all lines end with a single LF or \n.
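
Since od output is easiest to understand by example, here is a small sketch that recreates similar sample content with printf (in a temporary directory, so the real files are not touched) and dumps both files with od -c. In the Windows-style dump each line ends with \r \n, while the Unix-style dump shows only \n.

```shell
workdir=$(mktemp -d)
# Recreate the sample content with each kind of line ending.
printf 'Hi\r\nThis is\r\nsample file\r\n' > "$workdir/sample_win.txt"
printf 'Hi\nThis is\nsample file\n'       > "$workdir/sample_unix.txt"

# -c prints printable characters as-is and control characters as escapes.
win_dump=$(od -c "$workdir/sample_win.txt")
unix_dump=$(od -c "$workdir/sample_unix.txt")
printf '%s\n\n%s\n' "$win_dump" "$unix_dump"

# The size difference is one byte (the CR) per line.
win_size=$(wc -c < "$workdir/sample_win.txt")
unix_size=$(wc -c < "$workdir/sample_unix.txt")
echo "difference in bytes: $((win_size - unix_size))"
rm -rf "$workdir"
```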





4.How to see Control-M Character in a File:

To see the non-printable CR (Carriage Return, \r) characters, use cat with the -v option, as below.

[root@localhost machintosh]#cat -v sample_win.txt
Hi^M
This is ^M
sample file^M
[root@localhost machintosh]#
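
If you only want to know whether a file contains carriage returns, without reading through it, you can count the affected lines with grep. The sketch below is one way to do that; the helper name count_cr_lines is made up for this example, and the sample files are recreated in a temporary directory.

```shell
# Hypothetical helper: print how many lines of a file contain a CR.
count_cr_lines() {
    cr=$(printf '\r')          # a literal carriage return
    # grep exits with status 1 when nothing matches; that is not an error here.
    grep -c "$cr" "$1" || [ $? -eq 1 ]
}

workdir=$(mktemp -d)
printf 'Hi\r\nThis is\r\nsample file\r\n' > "$workdir/sample_win.txt"
printf 'Hi\nThis is\nsample file\n'       > "$workdir/sample_unix.txt"

win_count=$(count_cr_lines "$workdir/sample_win.txt")
unix_count=$(count_cr_lines "$workdir/sample_unix.txt")
echo "sample_win.txt:  $win_count lines with CR"
echo "sample_unix.txt: $unix_count lines with CR"
rm -rf "$workdir"
```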

You can also see the Control-M characters in the vi editor. Sometimes vi does not display them (Vim, for example, typically detects a file whose lines all end in CR LF as DOS format and hides the CRs). In that case, reload the file with the file format set to unix, as below.
[root@localhost machintosh]# vi sample_win.txt
Hi
This is
sample file
~
~
~
:edit ++ff=unix

[root@localhost machintosh]# vi sample_win.txt
Hi^M
This is ^M
sample file^M
~
~
~
"sample_win.txt" 3L, 27C

5.How to Remove the Control-M Characters (Carriage Returns, CR or \r) in Unix/Linux

There are many ways to remove Control-M characters. Below are some of the most common ones.

Method 1: Using a simple utility called dos2unix (provided the dos2unix utility is installed on your Unix/Linux machine).

           usage: dos2unix <<file_name>>

[root@localhost machintosh]# dos2unix sample_win.txt
dos2unix: converting file ./sample_win.txt to UNIX format ...
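
dos2unix is not installed by default on every system, so a script may want to check for it first. The following is a minimal sketch under that assumption: the function name convert_to_unix is invented here, and the tr fallback is a substitute technique (it deletes every CR in the file, not only those at line ends), not what dos2unix itself does.

```shell
# Hypothetical wrapper: use dos2unix when available, otherwise fall back to tr.
convert_to_unix() {
    file=$1
    if command -v dos2unix >/dev/null 2>&1; then
        dos2unix "$file"
    else
        # Fallback: delete all CR bytes via a temporary copy.
        tmp=$(mktemp) || return 1
        tr -d '\r' < "$file" > "$tmp" && cat "$tmp" > "$file"
        status=$?
        rm -f "$tmp"
        return $status
    fi
}

workdir=$(mktemp -d)
printf 'Hi\r\nThis is\r\nsample file\r\n' > "$workdir/sample_win.txt"
convert_to_unix "$workdir/sample_win.txt"

converted_size=$(wc -c < "$workdir/sample_win.txt")
echo "size after conversion: $converted_size bytes"
rm -rf "$workdir"
```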

You can also use the command below to remove Control-M characters from multiple files.

[root@localhost machintosh]# for f in `find . -xdev -type f \( -name "sample_win*.txt" -o -name "sample_win*.doc" \) -ls | awk '{print $NF}'`
> do
> dos2unix "$f"
> done
dos2unix: converting file ./sample_win1.txt to UNIX format ...
dos2unix: converting file ./sample_win2.txt to UNIX format ...
dos2unix: converting file ./sample_win.txt to UNIX format ...
[root@localhost machintosh]#

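Method 2: Using the stream editor (sed)
In sed, the s command replaces (substitutes) whatever matches the regular expression between the first and second slashes (^M) with the text between the second and third slashes (nothing in this case), and the g flag replaces globally (all occurrences on each line). The -i option edits the file in place. Note that ^M here stands for a single control character, typed as Ctrl-V followed by Ctrl-M, not a caret followed by the letter M; with GNU sed you can write \r instead.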

               usage : sed -i "s/^M//g" <<file_name>>
                       sed -i "s/^M//g" sample_win.txt (or) sed -i "s/\r//g" sample_win.txt
 
You can also use the command below to remove Control-M characters from multiple files.

[root@localhost machintosh]# for f in `find . -xdev -type f \( -name "sample_win*.txt" -o -name "sample_win*.doc" \) -ls | awk '{print $NF}'`
> do
> sed -i "s/^M//g" "$f"         # or: sed -i "s/\r//g" "$f"
> done
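
The loop above splits the output of find on whitespace, so it breaks on file names that contain spaces. A possible alternative, sketched here assuming GNU sed (for -i and \r), lets find pass the file names to sed directly. Anchoring the pattern with $ removes a CR only when it is the last character of a line. The sample files are recreated in a temporary directory for the demonstration.

```shell
workdir=$(mktemp -d)
printf 'Hi\r\n'  > "$workdir/sample_win1.txt"
printf 'Bye\r\n' > "$workdir/sample_win 2.txt"    # note the space in the name

# find hands the matching names to sed itself, so no word splitting happens.
find "$workdir" -xdev -type f \( -name "sample_win*.txt" -o -name "sample_win*.doc" \) \
    -exec sed -i 's/\r$//' {} +

# Count the CR bytes that are still present in the converted files.
remaining=$(cat "$workdir"/sample_win*.txt | tr -cd '\r' | wc -c)
echo "carriage returns left: $remaining"
rm -rf "$workdir"
```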

Method 3: Using the vi editor

%s is the basic search and replace command in vi. It tells vi to replace whatever matches the regular expression between the first and second slashes (^M) with the text between the second and third slashes (nothing in this case). The g at the end directs vi to search and replace globally (all occurrences). To enter ^M, press Ctrl-V and then Ctrl-M; typing a caret followed by the letter M will not work.

[root@localhost machintosh]# vi sample_win.txt
Hi^M
This is ^M
sample file^M
~
~
~
:%s/^M//g

Method 4: Using Perl

              Usage: perl -i -pe "s/^M//g" <<file_name>>
                     perl -i -pe "s/^M//g" sample_win.txt (or) perl -i -pe "s/\r//g" sample_win.txt


You can also use the command below to remove Control-M characters from multiple files.

[root@localhost machintosh]# for f in `find . -xdev -type f \( -name "sample_win*.txt" -o -name "sample_win*.doc" \) -ls | awk '{print $NF}'`
> do
> perl -i -pe "s/^M//g" "$f"                # or: perl -i -pe "s/\r//g" "$f"
> done

[root@localhost machintosh]#

