Text converter

Lakshman
Posts: 14213
Joined: 10 Feb 2010, 18:52

Text converter

Post by Lakshman »

Is there a program that will convert PDF or Word files from one Indian language to another? I want to convert Telugu text into English or dEvanAgari. Thanks.

VK RAMAN
Posts: 5009
Joined: 03 Feb 2010, 00:29

Re: Text converter

Post by VK RAMAN »

I think the Language feature in Microsoft Office 2007 and 2010 has a facility for conversion. Some techie from Microsoft should confirm this.

Lakshman
Posts: 14213
Joined: 10 Feb 2010, 18:52

Re: Text converter

Post by Lakshman »

Maybe I did not state the problem very well.
The text is already in PDF format on a website. I can copy the text but it won't get converted to any other language format with MS Word.

cmlover
Posts: 11498
Joined: 02 Feb 2010, 22:36

Re: Text converter

Post by cmlover »

Lakshman
It is the embedded font problem!
The PDF converter to MS Word works only with plain English texts...

arunk
Posts: 3424
Joined: 07 Feb 2010, 21:41

Re: Text converter

Post by arunk »

If the font is Unicode (it may not be), you could copy the text to the clipboard and use my transliterator (you can paste and then transl(iterate) to English).

It sort of works :) - but probably not reliably. I didn't plan for this "feature"; it sort of came about because of how the transliterator works (I mean I have not spent any time getting it right).

Arun
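
For readers wondering why Unicode matters here: the Telugu and Devanagari blocks in Unicode are laid out in parallel, so a script-to-script conversion can be approximated by shifting code points. The following is a minimal sketch of that idea only, not arunk's transliterator; the function name `telugu_to_devanagari` is hypothetical, and a few letters that exist in only one script will not map cleanly.

```python
# Assumption: input is real Unicode Telugu text (block U+0C00..U+0C7F).
# Text copied from a PDF with a custom embedded font will not satisfy this.
TELUGU_START, TELUGU_END = 0x0C00, 0x0C7F
DEVANAGARI_START = 0x0900


def telugu_to_devanagari(text: str) -> str:
    """Shift Telugu code points onto the parallel Devanagari block.

    Characters outside the Telugu block (Latin letters, digits,
    punctuation, spaces) are passed through unchanged.
    """
    offset = DEVANAGARI_START - TELUGU_START
    out = []
    for ch in text:
        cp = ord(ch)
        if TELUGU_START <= cp <= TELUGU_END:
            out.append(chr(cp + offset))
        else:
            out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    # U+0C30 U+0C3E U+0C2E is the Telugu spelling of "rAma".
    print(telugu_to_devanagari("\u0c30\u0c3e\u0c2e"))
```

Converting to English (Latin) letters is a different job: it needs a per-letter table and handling of the inherent vowel, which a plain offset cannot do.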

cmlover
Posts: 11498
Joined: 02 Feb 2010, 22:36

Re: Text converter

Post by cmlover »

Arun
It is mostly not Unicode!
When I copy it, it is all Greek to me!

arunk
Posts: 3424
Joined: 07 Feb 2010, 21:41

Re: Text converter

Post by arunk »

That's what I expected. If it isn't Unicode, it won't work at all with the transliterator.

Arun
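
A quick way to check whether text copied from a PDF is really Unicode Telugu, rather than Latin or symbol characters displayed through a custom font, is to count how many characters fall in the Telugu block. This is a rough sketch under that assumption; the helper name `telugu_fraction` is made up for illustration.

```python
def telugu_fraction(text: str) -> float:
    """Return the share of non-space characters in the Telugu block.

    A value near 1.0 suggests genuine Unicode Telugu; a value near 0.0
    suggests the PDF used a non-Unicode font encoding, so the copied
    text is not usable by a Unicode-based transliterator.
    """
    chars = [ch for ch in text if not ch.isspace()]
    if not chars:
        return 0.0
    telugu = sum(1 for ch in chars if 0x0C00 <= ord(ch) <= 0x0C7F)
    return telugu / len(chars)


if __name__ == "__main__":
    print(telugu_fraction("\u0c30\u0c3e\u0c2e"))  # Unicode Telugu sample
    print(telugu_fraction("gibberish copied from a PDF"))
```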

Lakshman
Posts: 14213
Joined: 10 Feb 2010, 18:52

Re: Text converter

Post by Lakshman »

The site I was thinking of is this one:

http://www.esnips.com//web/TallapaakaSa ... ButtonBlue

arunk
Posts: 3424
Joined: 07 Feb 2010, 21:41

Re: Text converter

Post by arunk »

That looks scanned, which means each page is stored as an image. You need OCR software; I am not sure if any exists for Telugu.

Arun
