Single Transliteration Scheme for all CM Languages - Part 2

Languages used in Carnatic Music & Literature
arunk
Posts: 3424
Joined: 07 Feb 2010, 21:41
x 2

#1

Post by arunk » 27 Jan 2007, 03:39

Dear friends,

I am starting a new thread because the other one has enough posts in it, and i consider this one as a sort of a "new coming" :). Hope mods do not mind.

I have a significant update to the single transliteration scheme project. I have now called the scheme Unified Transliteration Scheme for Carnatic Music Compositions (Yes - quite ingenious indeed ;)!)

There are some major enhancements this time around:

A full-blown editor
Note: This is in addition to the test-bed, which you can still use for basic testing.

1. There is now a full-blown editor which you can access from a browser (Firefox and IE have been tested; Safari should work too). Using this you can create sahitya with nice formatting such as bold, italic, headings, colors etc. You can also copy and paste text from a web-page or from Word (although i have not tested copy/paste from Word that much). Once you have your text ready, you click one button and it will render the text in your favorite CM language.
2. You can switch to a printable view of the translation of any particular language and then print it (no "save" support yet)
3. There is support for "Variables", which lets you specify certain terms (outside of the sahitya portions) that get translated differently for different languages so that the result is appropriate for each. There is work to be done here in the sense that the "database" of known variables needs to grow. I have only a handful now, but hopefully they allow you to get a feel for it. You can get information on how to use these Variables from the online help.

Once you know how to use the editor, a nice way to test it is to go to one of the fine rasikas.org Wiki pages with krithi listing, just select the sahitya portion (including headings), and paste it to the editor. You then ask the editor to "convert suitable terms to variables" and it should do so for raga, composer, pallavi, language, anupallavi etc. Then you translate - you may have to adjust a few things as the input text may not conform to the scheme. This also allows you to see why using variables is very useful.
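To give a feel for what "convert suitable terms to variables" does, here is a rough sketch in Python. It is not the editor's actual code: the `{{...}}` marker, the `KNOWN_TERMS` list and the `mark_variables` name are all made up for illustration, and the term list only contains the words mentioned above.

```python
import re

# Terms named in this post; the editor's real "database" of variables is not shown here.
KNOWN_TERMS = ["raga", "composer", "pallavi", "language", "anupallavi"]

# Whole-word, case-insensitive match. Longest terms go first in the alternation.
_PATTERN = re.compile(
    r"\b(" + "|".join(sorted(map(re.escape, KNOWN_TERMS), key=len, reverse=True)) + r")\b",
    re.IGNORECASE,
)


def mark_variables(text: str) -> str:
    """Wrap each known term in a hypothetical {{...}} variable marker."""
    return _PATTERN.sub(lambda m: "{{" + m.group(1) + "}}", text)
```

The word boundaries matter: "anupallavi" is marked as one term rather than as "anu" followed by a marked "pallavi".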

Note malayalam still needs work and has been disabled in the editor. Hopefully with jayaram's help, i can complete it in the next few weeks. You can still play with malayalam in the test bed page.

The editor will take a few seconds to load. Please bear with it. I think it gets faster on subsequent uses as the browser starts to reuse previously downloaded content from the cache, but it isn't going to be instantaneous. This is the price we pay for a richer interface. If you have a modem connection, it could be too slow - sorry! But with a cable modem connection or faster, it loads fast enough to not be a detriment.

Description of scheme
I have also created web-pages that describe the scheme in detail including all the context specific rules. This includes an overview, plus an interactive legend/index table where you can view the entire "alphabet" of the scheme. It also describes the qualifier schemes for tamil. I would very much like interested people to go over this to make sure things make sense and add up. Any feedback is really appreciated.

Links
You can read information on the transliteration scheme at: http://arunk.freepgs.com/cmtranslit/cmt ... cheme.html . If you notice, the version of the scheme says "1.0 Beta" - that means everything is not cast in stone, and we can still iron out stuff before it "officially" becomes 1.0.
You can directly access the index/legend at: http://arunk.freepgs.com/cmtranslit/legend.html
You can use the editor at: http://arunk.freepgs.com/cmtranslit
You can continue to use the test bed at: http://arunk.freepgs.com/cmtranslit/cmt ... stbed.html (old link also still works)


Hopefully fine looking sanskrit/telugu/tamil/kannada/malayalam content will be created and printed using this. If everybody accepts it, eventually I would like the Carnatic Wiki pages to somehow interface with the translator so that from the Wiki page of a krithi you click a button/link that means "i want to see this in my language" (or in all 5 languages), and it takes you to a page where the translation is already done. I think this is possible with some php stuff - although admin help would be needed. Something to think about.

I would like to request your support in evolving the scheme and its uses. Please give me feedback - good or bad. Suggestions are always welcome. Please post feedback on this thread or you can send me email too.

Thanks
Arun
Last edited by arunk on 27 Jan 2007, 03:54, edited 1 time in total.

#2

Post by arunk » 27 Jan 2007, 03:55

i had some typos in the links when i originally posted and some weren't working. i have corrected them now.

Arun

rshankar
Posts: 13190
Joined: 02 Feb 2010, 22:26
x 449
x 129

#3

Post by rshankar » 27 Jan 2007, 04:07

WOW! Arun,
Nice work! Must have taken you loads of time. Wonderful dedication!
I had this fond hope as I was scrolling through the page that you'd have spelt the death knell for goof ups like panDu rIdi and banduvarALI...but then I realized that this works on the GIGO scheme as well, correct?

ramakriya
Posts: 1833
Joined: 04 Feb 2010, 02:05

#4

Post by ramakriya » 27 Jan 2007, 04:15

Arun,

This is wonderful.. I will play around when I have some free time.

-Ramakriya

#5

Post by arunk » 27 Jan 2007, 04:17

thanks ravi. It was a bit time consuming, although the editor itself is available for free (it's called TinyMCE - that reminds me that i should be acknowledging that). I had to extend it, and i chose it precisely because it is extensible.

paNDUridi will "look right" as long as you don't turn on qualifiers, but as soon as you do that, the jig could be up :). But then that is if the person inputting knows which is right and which is wrong! It is only as good as what input you provide. You give it gibberish, it will translate it to gibberish!

But if we have "official pages" with the correct pronunciation, and people use the translator, it should help retain the correct pronunciation (again if we use qualifiers which would be a must for non-tamizh krithis being rendered in tamizh).

What is GIGO btw? Sorry - i dont know that term.

Arun

Suji Ram
Posts: 1529
Joined: 09 Feb 2006, 00:04
x 1

#6

Post by Suji Ram » 27 Jan 2007, 04:23

Awesome Arun,
Absolutely COOL!! Congrats

#7

Post by rshankar » 27 Jan 2007, 04:32

GIGO was one of my dad's favorite terms:
Garbage In = Garbage Out

#8

Post by arunk » 27 Jan 2007, 04:35

thanks ramakriya, suji. Hopefully you guys will find time to play with it. I will try to set up some examples like last time in the next couple of days. But at this point, we need to make sure that the scheme is ok.

ravi - yep GIGO alright :)

Arun

#9

Post by ramakriya » 27 Jan 2007, 04:39

Is there a way to save all language versions into a *single* file when I hit transliterate?

-Ramakriya

#10

Post by Suji Ram » 27 Jan 2007, 04:42

Already tested the Editor and it is great. Copied and pasted some kritis and tested and they look cool. Anything from Karnatik.com turns out very funny since the scheme is different there.

#11

Post by arunk » 27 Jan 2007, 04:43

ramakriya - not yet. Actually saving even one isnt directly supported yet.

But i will make sure to make this possible. I did have that in mind. Even in printable view, i would like to make it possible to view multiple languages (which should make way for the save).

Arun
Last edited by arunk on 27 Jan 2007, 04:45, edited 1 time in total.

#12

Post by arunk » 27 Jan 2007, 04:44

Suji Ram wrote:Already tested the Editor and it is great. Copied and pasted some kritis and tested and they look cool. Anything from Karnatik.com turns out very funny since the scheme is different there.
one possibility for "later on" is to implement "inter-scheme" translations - but that can be a can of worms :).

Arun

#13

Post by ramakriya » 27 Jan 2007, 04:46

Here is my first trial: What else other than an invocation to ganEsha?

ಗಜವದನ ಬೇಡುವೆ; ಪುರಂದರ ದಾಸರ ರಚನೆ

ಗಜವದನ ಬೇಡುವೆ ಗೌರೀತನಯ
ತ್ರಿಜಗವಂದಿತನೆ ಸುಜನರ ಪೊರೆವನೆ ||ಪಲ್ಲವಿ||

ಪಾಶಾಂಕುಶಧರ ಪರಮ ಪವಿತ್ರ
ಮೂಷಕ ವಾಹನ ಮುನಿಜನ ಪ್ರೇಮ || ಅನುಪಲ್ಲವಿ||

ಮೋದದಿ ನಿನ್ನಯ ಪಾದವ ತೋರೋ
ಸಾಧು ವಂದಿತನೇ ಆದರದಿಂದಲಿ || ಚರಣ 1||

ಸರಸಿಜನಾಭ ಶ್ರೀ ಪುರಂದರ ವಿಠಲನ
ನಿರುತ ನೆನೆಯುವಂತೆ ದಯಮಾಡೋ || ಚರಣ 2||


गजवदन बेडुवॆ; पुरंदर दासर रचनॆ

गजवदन बेडुवॆ गौरीतनय
त्रिजगवंदितनॆ सुजनर पॊरॆवनॆ ||पल्लवि||

पाशांकुशधर परम पवित्र
मूषक वाहन मुनिजन प्रेम || अनुपल्लवि||

मोददि निन्नय पादव तोरो
साधु वंदितने आदरदिंदलि || चरण 1||

सरसिजनाभ श्री पुरंदर विठलन
निरुत नॆनॆयुवंतॆ दयमाडो || चरण 2||

గజవదన బేడువె; పురందర దాసర రచనె

గజవదన బేడువె గౌరీతనయ
త్రిజగవందితనె సుజనర పొరెవనె ||పల్లవి||

పాశాంకుశధర పరమ పవిత్ర
మూషక వాహన మునిజన ప్రేమ || అనుపల్లవి||

మోదది నిన్నయ పాదవ తోరో
సాధు వందితనే ఆదరదిందలి || చరణ 1||

సరసిజనాభ శ్రీ పురందర విఠలన
నిరుత నెనెయువంతె దయమాడో || చరణ 2||

கஜவதன பேடுவெ; புரந்தர தாசர ரசனெ

கஜவதன பேடுவெ கௌரீதனய
த்ரிஜகவந்திதனெ சுஜனர பொரெவனெ ||பல்லவி||

பாஸாங்குஸதர பரம பவித்ர
மூஷக வாஹன முனிஜன ப்ரேம || அனுபல்லவி||

மோததி நின்னய பாதவ தோரோ
சாது வந்திதனே ஆதரதிந்தலி || சரண 1||

சரசிஜனாப ஸ்ரீ புரந்தர விடலன
நிருத நெனெயுவன்தெ தயமாடோ || சரண 2||

Very neat Arun!

-Ramakriya
Last edited by ramakriya on 27 Jan 2007, 04:49, edited 1 time in total.

#14

Post by arunk » 27 Jan 2007, 04:49

you should be able to "convert" each caraNa to a variable (look at the online help for how to do it - basically you place the cursor on the word and click the right button).

Then even in tamizh it will come out correctly as caraNam (as opposed to caraNa).
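Roughly, a variable can be thought of as a per-language lookup. The sketch below is only an illustration in Python, not the editor's code: `VARIABLES` and `render_variable` are made-up names, and the only entry that comes from this thread is caraNa becoming caraNam for tamizh. Every other language simply falls back to the term as typed.

```python
# Hypothetical table of per-language overrides for a variable.
# Only the tamil entry is taken from the thread; nothing else is known here.
VARIABLES = {
    "caraNa": {"tamil": "caraNam"},
}


def render_variable(term: str, language: str) -> str:
    """Return the language-specific form of a term, or the term itself if none is listed."""
    return VARIABLES.get(term, {}).get(language, term)
```

The returned text is still in the transliteration scheme, so it would go through the normal translation step afterwards.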

Arun

#15

Post by Suji Ram » 27 Jan 2007, 04:50

Ramakriya,
you did not turn on the qualifiers for tamizh..

#16

Post by arunk » 27 Jan 2007, 04:53

right - but then there is no universal qualifier scheme. I like the "natural one" (and really detest the other), and i think there are others who may feel vice-versa :)

One more thing i was planning to implement was a way to indicate "include legend in output" - and it puts up:
(1) an indication of the qualifier scheme
(2) and/or a concise form of the legend including ONLY those letters that end up using qualifiers (as opposed to the entire legend, which would be impractical)

Arun
Last edited by arunk on 27 Jan 2007, 04:54, edited 1 time in total.

#17

Post by ramakriya » 29 Jan 2007, 23:10

Please take a look at the following wiki page generated by Arun's editor.

http://www.rasikas.org/wiki/sri-mahaganapatim-bhajeham

I have not yet tried using variables etc.

-Ramakriya

#18

Post by arunk » 30 Jan 2007, 01:48

Nice. But we have some issues to deal with

1. Font sizes aren't the same for all - tamizh is too big, sanskrit and telugu too small (for me on my computer). This of course can depend on the individual user's browser settings. In mine I believe different fonts got picked and they all have different sizes.

2. tamizh qualifiers aren't showing up nicely at all - they aren't super-scripted and hence are "taking over the content" and are an eye-sore.

I am guessing #2 is because of copy and paste where some HTML tags got lost - or maybe they got lost on entry to the wiki (which, while it makes things easier, can be quite restrictive if you want to mix html). Perhaps a "copy contents to clipboard" option like the test-bed has would solve this. It may be possible to provide "copy as wiki".

#1, i am not yet sure. One possibility is to explicitly use font sizes (8pt, 10pt etc.) and that may even things out.
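For #2, one possible stop-gap is to put the superscript tags back after pasting. This is only a sketch in Python under a couple of assumptions: that a qualifier is a plain digit 1-4 sitting right after a tamizh character (samples elsewhere in this thread show 1 and 3), and that the target page accepts `<sup>` tags. The name `superscript_qualifiers` is made up.

```python
import re

# A character from the Tamil Unicode block (U+0B80..U+0BFF) immediately
# followed by an ASCII qualifier digit. The 1-4 range is an assumption.
_QUALIFIER = re.compile(r"([\u0B80-\u0BFF])([1-4])")


def superscript_qualifiers(text: str) -> str:
    """Wrap qualifier digits that directly follow a Tamil character in <sup> tags."""
    return _QUALIFIER.sub(r"\1<sup>\2</sup>", text)
```

A digit that follows a space (as in a caraNa number) is left alone, since it does not directly follow a tamizh character.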

Arun
Last edited by arunk on 30 Jan 2007, 01:49, edited 1 time in total.

#19

Post by ramakriya » 30 Jan 2007, 02:02

Arun,

I too find the same mismatch in the sizes (Tamil v/s Samskrita-kannada etc). I used all default settings.

Is there a paste-special mode where superscripts will remain as such?

One problem I found is that single spaces between words show up as no spaces in the transliterated text!

-Ramakriya
Last edited by ramakriya on 30 Jan 2007, 02:02, edited 1 time in total.

#20

Post by arunk » 30 Jan 2007, 02:09

ramakriya wrote:Is there a paste-special mode where superscripts will remain as such?
How are you doing this? in wiki you edit and simply paste? Let me know and I will play around
ramakriya wrote:One problem I found is that single spaces between words show up as no spaces in the transliterated text!
Where? In the editor itself, or printable view, or on paste in wiki?

Arun

#21

Post by ramakriya » 30 Jan 2007, 02:24

arunk wrote:
ramakriya wrote:Is there a paste-special mode where superscripts will remain as such?
How are you doing this? in wiki you edit and simply paste? Let me know and I will play around
Arun
Yes - Also when I copy-paste, some parts of the samyuktaksharas in Kannada/Telugu don't show up correctly (look at the example given below - the 'ya' and 'ma' ottu). What should be done about this?
arunk wrote:
ramakriya wrote:One problem I found is that single spaces between words show up as no spaces in the transliterated text!
Where? In the editor itself, or printable view, or on paste in wiki?

Arun
In both. And this is happening in Kannada and Telugu only. In SamskRta and Tamil the space shows up correctly, as shown in the following example:

gajAraNyavAraNam jyOtirmayam
गजारण्यवारणम् ज्योतिर्मयम्
గజారణ్యవారణంజ్యోతిర్మయం
ಗಜಾರಣ್ಯವಾರಣಂಜ್ಯೋತಿರ್ಮಯಂ
க3ஜாரண்யவாரணம் ஜ்யோதி1ர்மயம்
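A quick way to catch this kind of thing is to compare the number of words before and after translation. The Python sketch below is just an illustration (the `spaces_preserved` name is made up); the sample strings are the ones from the example above.

```python
def spaces_preserved(source: str, rendered: str) -> bool:
    """True if the rendered text has the same number of whitespace-separated words as the source."""
    return len(source.split()) == len(rendered.split())


source = "gajAraNyavAraNam jyOtirmayam"
sanskrit = "गजारण्यवारणम् ज्योतिर्मयम्"   # space kept
telugu = "గజారణ్యవారణంజ్యోతిర్మయం"        # space lost

print(spaces_preserved(source, sanskrit))
print(spaces_preserved(source, telugu))
```

It only counts words, so it will not say where the space went missing, but it is enough to flag which languages are affected.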

-Ramakriya
Last edited by ramakriya on 30 Jan 2007, 02:24, edited 1 time in total.

#22

Post by arunk » 30 Jan 2007, 02:27

i will look at the copy-paste later when I get a chance.

The space problem, i bet has to do with my logic for the anuswara. I can reproduce it on my computer. So i will try to fix it.

Thanks!
Arun

#23

Post by arunk » 30 Jan 2007, 03:54

i found the space problem after anuswara. It was a bug in the logic. I do not yet know what is going on with the copy and paste problem. My guess is perhaps the font being selected by the browser when viewing rasikas.org page is not a good one and is buggy?

For example, if I copy the (supposedly) correct kannada text for it from the test-bed, and paste it here in my reply, it changes to something bogus:

ಗಜಾರಣ್ಯವಾರಣಂ ಜ್ಯೋತಿರ್ಮಯಂ

The above is not what I see in the window from which i copied it to clipboard! The Nya, rma etc. are messed up.

Arun

#24

Post by arunk » 30 Jan 2007, 04:05

i found a clue: Any time you paste into an edit-box or a text-box (like our post submission part, wiki submission) things go wacky.

Even in the test-bed, if i copy the kannada for gajAraNya from the kannada section back into the edit-box (i.e. the one under "Type in text to translate"), it goes wacky!

This sort of sucks. Unless we find a solution, submitting non-english content via forms has problems.

Arun

#25

Post by arunk » 30 Jan 2007, 04:18

if i explicitly set up the text-boxes to use a unicode font (e.g. Arial Unicode MS), it works. So I think we need to ask srkris to change the submission boxes to use unicode fonts.

i.e. adding style="font-family:Arial Unicode MS,Lucida Sans Unicode;" to the input tags did the trick for me. Of course one has to watch out for side-effects.

Arun