Recursive file encoding using native2ascii

02.05.2012.

Native2ascii is a Java command-line tool that converts files containing native-encoded characters (characters that are neither Latin-1 nor Unicode-escaped) into files in which those characters are written as Unicode escape sequences.

It ships as part of the JDK, and it matters because the Java compiler and other Java tools are only able to process files which contain ASCII characters and \udddd Unicode escape sequences, where each d is a hexadecimal digit.
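To make the conversion concrete, here is a minimal Java sketch of the kind of rewriting described above: every character outside 7-bit ASCII is replaced by a backslash-u escape with four hex digits. The class name Native2AsciiSketch and the method escape are hypothetical names chosen for illustration; this is not the tool's actual source, only an approximation of its visible effect on text that has already been decoded into a String.

```java
public class Native2AsciiSketch {

    // Replace every char above 0x7F with a backslash-u escape (four hex digits).
    // ASCII characters are copied through unchanged.
    static String escape(String text) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c > 0x7F) {
                out.append(String.format("\\u%04x", (int) c));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The literal below contains two non-ASCII letters written as escapes
        // so that this source file itself stays pure ASCII.
        System.out.println(escape("naziv=\u010da\u0161a"));
    }
}
```

Run as is, the sketch prints the key unchanged and the two non-ASCII letters of the value as six-character escape sequences.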

Native2ascii syntax

Native2ascii has the following command syntax:

native2ascii [options] [inputfile [outputfile]]

The following options are supported:
• -encoding encoding_name – specifies the encoding used for conversion. If not specified, the encoding defined in the system property file.encoding is used.
• -reverse – performs the reverse operation, converting Unicode escape sequences back to native-encoded characters.

Inputfile – refers to the file which is to be converted. If omitted, the standard input is used.
Outputfile – represents the file that receives the converted text. If omitted, the standard output is used.
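The reverse direction can be sketched in Java as well. The example below is a simplified illustration under stated assumptions, not the tool's implementation: the class ReverseSketch and the method unescape are hypothetical names, and the sketch deliberately ignores edge cases such as a doubled backslash in front of the letter u.

```java
public class ReverseSketch {

    // True when the four chars starting at index 'from' are all hex digits.
    private static boolean fourHexDigits(String s, int from) {
        for (int i = from; i < from + 4; i++) {
            if (Character.digit(s.charAt(i), 16) < 0) {
                return false;
            }
        }
        return true;
    }

    // Turn each backslash-u escape followed by four hex digits back into
    // the character it stands for; everything else is copied unchanged.
    static String unescape(String text) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < text.length()) {
            char c = text.charAt(i);
            boolean isEscape = c == '\\'
                    && i + 6 <= text.length()
                    && text.charAt(i + 1) == 'u'
                    && fourHexDigits(text, i + 2);
            if (isEscape) {
                out.append((char) Integer.parseInt(text.substring(i + 2, i + 6), 16));
                i += 6;
            } else {
                out.append(c);
                i++;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Compares the decoded text with the same letters written as
        // compile-time escapes, so no console encoding is involved.
        System.out.println(unescape("\\u010da\\u0161a").equals("\u010da\u0161a"));
    }
}
```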

Example: how to convert a file

Here is an example of using native2ascii to convert a file. Let's say we want to convert the file app.properties, which is saved in UTF-8 encoding and stored in the C:\Native2ASCII folder. The converted output should go into another file named app_converted.properties, stored in the same C:\Native2ASCII folder.

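With the file names and folder given above, the command for the file conversion looks like this:

native2ascii -encoding UTF8 C:\Native2ASCII\app.properties C:\Native2ASCII\app_converted.properties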

The special Croatian characters (č, ć, đ, š, ž) are now replaced with Unicode escape sequences, and the converted content is stored in the app_converted.properties file.
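A quick way to see why the converted file is still usable is to load it with java.util.Properties, which decodes the escape sequences back into the original characters. The sketch below assumes a hypothetical property key named title whose value consists of the five Croatian letters; the real app.properties content is not shown in this article, so both the key and the value are illustrative only.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

public class LoadConvertedSketch {

    // Parse properties-file content and return the value of the
    // (hypothetical) "title" key.
    static String readTitle(String fileContent) throws IOException {
        Properties props = new Properties();
        props.load(new StringReader(fileContent));
        return props.getProperty("title");
    }

    public static void main(String[] args) throws IOException {
        // What a converted line could look like: pure ASCII with escapes.
        String converted = "title=\\u010d\\u0107\\u0111\\u0161\\u017e";
        // The expected value, written with compile-time escapes.
        String expected = "\u010d\u0107\u0111\u0161\u017e";
        System.out.println(readTitle(converted).equals(expected)); // prints true
    }
}
```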

More advanced use

Although it is most often used for properties files, native2ascii can also convert other file types, such as Java, JavaScript, JSP and other Java-related files. To go even further, here is a sample batch script that does an in-place conversion of all files of one type in a folder and its subfolders:

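@echo off
rem Usage: encode sourcedir [filetype]
set "dirname=%~1"
set "filetype=%~2"
rem Adjust jpath to the bin folder of the installed JDK.
set "jpath=c:\Program Files\Java\jdk1.7.0_03\bin\"
echo START encoding...
rem pushd remembers the current directory and also switches drives if needed.
pushd "%dirname%" || exit /b 1
rem "delims=" keeps paths with spaces in one piece; /a-d skips directories.
for /f "delims=" %%i in ('dir /s /b /a-d %filetype%') do (
 echo Encode %%i
 rem Convert into a temporary file, then replace the original only on success.
 "%jpath%native2ascii" -encoding utf8 "%%i" "%%i.tmp" && move /y "%%i.tmp" "%%i" >nul
 echo DONE encoding %%i
)
popd
echo done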

To run this script, use the following syntax:

encode sourcedir [filetype]

Sourcedir is mandatory and names the directory whose files are to be converted. Filetype selects the files which are to be converted, given as a wildcard pattern such as *.java. If omitted, the changes are applied to all files in the folder and its subfolders.

On Linux and UNIX systems, similar functionality can be achieved using a one-liner like the following:

find . -name "*.properties" -exec native2ascii {} {} \;
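The same recursive, in-place idea can also be sketched in plain Java, which may be handy where neither a batch file nor find is convenient. This is only an illustration under stated assumptions: the class RecursiveEncodeSketch and its methods are hypothetical names, input files are assumed to be UTF-8, files are matched by a simple name suffix rather than a wildcard, and the escaping is an approximation written for this example rather than a call into native2ascii itself.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class RecursiveEncodeSketch {

    // Replace every char above 0x7F with a backslash-u escape (four hex digits).
    static String escape(String text) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c > 0x7F) {
                out.append(String.format("\\u%04x", (int) c));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    // Convert, in place, every regular file under root whose name ends with
    // the given suffix. Returns the number of files converted.
    static int convertTree(Path root, String suffix) throws IOException {
        List<Path> files;
        // Collect the matches first so the temporary files created below
        // are never picked up by the directory walk.
        try (Stream<Path> walk = Files.walk(root)) {
            files = walk.filter(Files::isRegularFile)
                    .filter(p -> p.getFileName().toString().endsWith(suffix))
                    .collect(Collectors.toList());
        }
        for (Path file : files) {
            String text = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
            // Write to a sibling temporary file, then swap it in for the original.
            Path tmp = file.resolveSibling(file.getFileName() + ".tmp");
            Files.write(tmp, escape(text).getBytes(StandardCharsets.US_ASCII));
            Files.move(tmp, file, StandardCopyOption.REPLACE_EXISTING);
        }
        return files.size();
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 1) {
            System.err.println("Usage: java RecursiveEncodeSketch sourcedir [suffix]");
            return;
        }
        // An empty suffix matches every file, mirroring an omitted file type.
        String suffix = args.length > 1 ? args[1] : "";
        int count = convertTree(Paths.get(args[0]), suffix);
        System.out.println("Converted " + count + " file(s)");
    }
}
```

A call such as java RecursiveEncodeSketch C:\Native2ASCII .properties would process every .properties file below that folder.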