Single Transliteration Scheme for all CM Languages - Part 2

Languages used in Carnatic Music & Literature
arunk
Posts: 3424
Joined: 07 Feb 2010, 21:41
x 2

#51

Post by arunk » 01 Feb 2007, 20:30

mahakavi,

I am not sure but there are a few variables:
1. The email program you used to send it. It should be able to send the content as Unicode. If it supports sending in HTML, I would presume it does.
2. The email program she used to read it. Again, it should be able to read Unicode AND apply the correct font for the tamizh Unicode text (browsers nowadays do this automatically). Of course, this also depends on whether a font that supports tamizh Unicode is installed (nowadays this is not much of a problem).
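Both variables come down to the message declaring its character set so that the reader can decode it. Here is a minimal sketch in Python, assuming the standard-library `email` package. The addresses and the `build_message` helper are made up for illustration, and this says nothing about how Yahoo, Hotmail or Outlook actually build their mail:

```python
import html
from email.message import EmailMessage


def build_message(sender, recipient, subject, text):
    """Build a mail with a UTF-8 plain-text part and a UTF-8 HTML part."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    # Plain-text body: non-ASCII text is labelled with charset utf-8.
    msg.set_content(text)
    # HTML alternative carrying the same text; the message becomes
    # multipart/alternative so the reader can pick the part it supports.
    msg.add_alternative(
        f"<html><body><p>{html.escape(text)}</p></body></html>",
        subtype="html",
    )
    return msg


# Placeholder addresses, for illustration only.
msg = build_message("sender@example.com", "reader@example.com", "test", "தமிழ்")
for part in msg.iter_parts():
    print(part.get_content_type(), part.get_content_charset())
```

Whether a given reader honours the declared charset, and has a suitable font, is a separate matter.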

If you send it from a webmail account in HTML to another webmail account, I would have expected it to work. Let me do a bit of testing and see what's going on.

Arun

#52

Post by arunk » 01 Feb 2007, 20:45

mahakavi,

I sent mail from my yahoo account to myself (i.e. yahoo) and also to another account for which Outlook is the reader.

If I send mail while my compose settings say "send as plain text", the mail on receipt does appear in tamizh, but words in a single line are split across multiple lines.

After I changed the compose settings to say "send with colors and graphics" (i.e. html/rich-text), it appeared fine on receipt (both yahoo and Outlook).

Find out what email program your friend uses and/or send it to a webmail account or an account that uses Outlook as reader.

Arun
Last edited by arunk on 01 Feb 2007, 20:46, edited 1 time in total.

#53

Post by arunk » 01 Feb 2007, 20:49

I spoke too soon :)

Send as plain text: Looks correct on Outlook, but wrong on Yahoo (in tamizh, but words in a single line are split across multiple lines).
Send as HTML: Looks correct on Yahoo, but wrong on Outlook (not even in tamizh; the characters show up as HTML Unicode entities, e.g. &#2949; for tamizh "a").

So depending on which account you send to, you need to send it differently :). What a wacky world!
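For reference, the entities seen on Outlook are just the decimal Unicode code points of the characters, written as HTML numeric character references. A small Python sketch of the round trip, using only the standard library (the two helper names are made up for illustration):

```python
import html


def to_numeric_entities(text):
    """Replace every non-ASCII character with a decimal reference like &#NNNN;."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


def from_numeric_entities(text):
    """Turn numeric character references back into the characters they name."""
    return html.unescape(text)


encoded = to_numeric_entities("அ")        # tamizh "a", U+0B85
decoded = from_numeric_entities(encoded)
print(encoded, decoded)
```

A reader that shows the raw references is most likely displaying the HTML part as plain text instead of rendering it.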

Arun

rshankar
Posts: 13363
Joined: 02 Feb 2010, 22:26
x 582
x 158

#54

Post by rshankar » 01 Feb 2007, 22:12

Arun,
Can you help me to fix the bolded words here:
इक परदेसि मेरा दिल ले गया
जाते जाते मीठा मीठा गम दे गया

आप यूङ् ही अगर हमसे मिलते रहे
देखिये एक दिन प्यार हो जायेगा

आंखोन् से जो उतरी है दिल् मे
(क्या बात है उस परवाने मे)
खुद ढूङ्ढ रही है शम्मा जिसे
क्या बात है उस पर्वाने मे?

Thanks...

mahakavi
x 1

#55

Post by mahakavi » 01 Feb 2007, 22:13

arunk:
Thanks.
What about hotmail?
I will try it myself soon.

#56

Post by arunk » 01 Feb 2007, 22:17

ravi,

I am very poor (used to be a zero) at reading Devanagari. What is the transliteration text you entered?

Thanks
Arun

#57

Post by mahakavi » 01 Feb 2007, 22:22

Is it "AnkhOne"?

#58

Post by rshankar » 01 Feb 2007, 22:25

yU#n - what I wanted to show up was yU with a chandrabindu on top...

A#nkhEn - Need a chandrabindu over A...I figured out the rest of the issues when I played around it with the scheme...

#59

Post by arunk » 01 Feb 2007, 22:27

deleted
Last edited by arunk on 01 Feb 2007, 22:29, edited 1 time in total.

#60

Post by arunk » 01 Feb 2007, 22:28

The candrabindu is that teeny weeny dot. That's how the font renders it.

(unless I am wrong)

#61

Post by arunk » 01 Feb 2007, 22:29

rshankar wrote:yU#n - what I wanted to show up was yU with a chandrabindu on top...
No support for this. Is this hindi specific or does sanskrit have it too?

Arun

#62

Post by rshankar » 01 Feb 2007, 22:30

candrabindu is a dot (the bindu part) inside a quarter circle (the candra part - would look like the parenthesis that ends this part rotated 90 degrees clockwise)...
Also, how do I get a bindu atop the last letter of a word?
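At the Unicode level the two signs are separate combining characters that follow the letter (and any vowel sign) they sit on. A small Python sketch of the output side only; it does not show how the transliteration scheme itself should be typed, and the `with_sign` helper is made up for illustration:

```python
import unicodedata

CANDRABINDU = "\u0901"  # dot inside a crescent
ANUSVARA = "\u0902"     # plain dot (bindu)


def with_sign(syllable, sign):
    """Append a nasal sign after a syllable (consonant plus optional vowel sign)."""
    return syllable + sign


yun = with_sign("\u092F\u0942", CANDRABINDU)  # ya + vowel sign uu + candrabindu
me = with_sign("\u092E\u0947", ANUSVARA)      # ma + vowel sign e + anusvara

for sign in (CANDRABINDU, ANUSVARA):
    print(f"U+{ord(sign):04X}", unicodedata.name(sign))
print(yun, me)
```

How large or small each sign looks on screen is up to the font.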

#63

Post by rshankar » 01 Feb 2007, 22:31

arunk wrote:
rshankar wrote:yU#n - what I wanted to show up was yU with a chandrabindu on top...
No support for this. Is this hindi specific or does sanskrit have it too?

Arun
I think it is Hindi specific...not too sure.

#64

Post by arunk » 01 Feb 2007, 22:33

Got it. Now for sanskrit, when is candrabindu used vs when is anuswara (the dot) used?

I am generating anuswara here (in all cases). I am wondering whether for sanskrit I should always generate candrabindu, or whether it depends on context.

Thanks
Arun

#65

Post by arunk » 01 Feb 2007, 22:36

btw, I have seen at least one CM book in sanskrit where the anuswara is used for pa#nkaja etc.

Arun

ramakriya
Posts: 1833
Joined: 04 Feb 2010, 02:05
x 1

#66

Post by ramakriya » 01 Feb 2007, 23:22

arunk,

Seems to be a new bug now:

The word SR.ngAra (as the rasa) is transliterated incorrectly (except in tamizh), as below:

श्र्.ंगार శ్ర్.ంగార ಶ್ರ್.ಂಗಾರ ஸ்2ரு2ங்கார

-Ramakriya
Last edited by ramakriya on 01 Feb 2007, 23:22, edited 1 time in total.

#67

Post by arunk » 01 Feb 2007, 23:31

I will check. Strangely, sR.ng works fine :). That just indicates it is a bug in handling 'S'. It looks like it is getting confused with the SrI logic.
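One common way to keep a multi-character token such as "R." from being broken up by a special case on a neighbouring letter is to match the longest token first. The sketch below is only an illustration in Python, not the actual cmtranslit code: the `TOKENS` inventory is hypothetical, pieced together from the examples in this thread, and `tokenize` is a made-up helper.

```python
# Hypothetical token inventory, inferred from words quoted in this thread
# (SR.ngAra, SrI, pa#nkaja, pa~nca); the real scheme has many more tokens.
TOKENS = [
    "R.", "#n", "~n", "kh",
    "S", "s", "A", "a", "U", "I", "E", "O",
    "n", "g", "r", "p", "k", "j", "c", "y",
]


def tokenize(text):
    """Split scheme text into tokens, always trying the longest token first."""
    by_length = sorted(TOKENS, key=len, reverse=True)
    out = []
    i = 0
    while i < len(text):
        for tok in by_length:
            if text.startswith(tok, i):
                out.append(tok)
                i += len(tok)
                break
        else:
            # Unknown character: pass it through unchanged.
            out.append(text[i])
            i += 1
    return out


print(tokenize("SR.ngAra"))
print(tokenize("SrI"))
```

With this ordering, "R." is consumed as one unit after "S", so any special handling for "S" followed by "r" (as in SrI) only ever sees a lowercase "r" token.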

Arun

#68

Post by mahakavi » 02 Feb 2007, 00:04

arunk:
When I send the Thamizh text from roadrunner to roadrunner (myself), where I use Outlook Express, the message reads fine. But from hotmail to hotmail or roadrunner, it is all a lot of numbers. From hotmail to Yahoo, the fidelity of the text is preserved but, as you said, the text gets split into numerous lines.

When I send from roadrunner to yahoo or hotmail addresses (using unicode format), it is again mumbo-jumbo, different from the previous gibberish numbers.

Well I'll leave it there. Don't bother to resolve it unless you get to resolve it by sheer luck.

#69

Post by arunk » 02 Feb 2007, 00:14

ramakriya - i uploaded a fix for SR.ngAra.

mahakavi - there is not much we can do. It is not something unique to what I am generating, but is a problem with sending Unicode text over email. But I am surprised hotmail doesn't handle its own mail!

Arun

#70

Post by rshankar » 02 Feb 2007, 09:46

divakar wrote:rshankar: i wonder how you got the language script.
I used Arun's transliteration program.

#71

Post by arunk » 02 Feb 2007, 22:12

divakar - please see http://arunk.freepgs.com/cmtranslit (and also threads under Languages section here on the forum).

ravi - If hindi uses candrabindu for pa#nkaja (pa~nca), but sanskrit uses anuswara, then the only way would be to support hindi as a separate language. This is of course possible but, strictly speaking, would throw a wrinkle in the nomenclature of the scheme :). But there are some issues. Does hindi use candrabindu for words like mAm (i.e. words ending in "m")? Another wrinkle is that words like yU#n may be quite difficult to represent in other languages unless we use qualifiers. The trouble is that in non-tamizh scripts, successive consonants can be combined into single glyphs, and that makes qualifiers harder, at least in the unicode representation.

Also, for words like yU#n, is the ending sound really supposed to represent the character #n here?

Sorry if those questions don't make sense. Yet again I am coming across a language which I don't know well :)

Arun
Last edited by arunk on 02 Feb 2007, 22:13, edited 1 time in total.

#72

Post by rshankar » 02 Feb 2007, 23:43

Arun, for the most part Hindi and Sanskrit use the same style for forming words...it is the urdU words that may make Hindi different...for instance, when Om is written in sanskrit, a chandrabindu is used...that is what I mean. For pa~nca, pa#nkaja, a bindu will suffice, but for Ankh, or yUn, a candrabindu is needed. I am not a very 'rules' oriented speller, but instinctively get hindi spelt correctly (don't ask me why or how, because in all other languages, I have learnt to ignore my instincts to get the spelling right!)...

drshrikaanth
Posts: 4066
Joined: 26 Mar 2005, 17:01
x 1

#73

Post by drshrikaanth » 03 Feb 2007, 00:08

Ravi
Sanskrit and hindi do not follow the same pattern of writing their words. A bindu will not suffice in sanskrit for words like pa#nkaja, pa~nca. The letters have to be clearly shown in the conjunct.
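To make the difference concrete at the character level, here are the two Devanagari spellings of pa#nkaja spelled out as Unicode code points. This is a small Python sketch using only the standard library; the `describe` helper is made up for illustration.

```python
import unicodedata

# Conjunct spelling: pa, nga, virama, ka, ja.
CONJUNCT = "\u092A\u0919\u094D\u0915\u091C"
# Bindu spelling: pa, anusvara, ka, ja.
BINDU = "\u092A\u0902\u0915\u091C"


def describe(word):
    """List the Unicode character names in a word, without the script prefix."""
    return [unicodedata.name(ch).replace("DEVANAGARI ", "") for ch in word]


for word in (CONJUNCT, BINDU):
    print(word, describe(word))
```

In the conjunct form the nasal letter is written explicitly and joined to the next consonant with a virama; in the bindu form a single anusvara sign stands in for it.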

Hindi is a lot like kannaDa, telugu and many others in that respect. the spellings are simplified.

#74

Post by arunk » 03 Feb 2007, 00:31

drshrikaanth wrote:A bindu will not suffice in sanskrit for words like pa#nkaja, pa~nca. The letters have to be clearly shown in the conjunct.
Then at least one book I have doesn't follow this convention, as it uses the bindu. Maybe it is following hindi rules, or perhaps it is a convention that, while not kosher, is OK to many people (my wife, who has learned sanskrit, didn't seem bothered by it).

It is the book on Syama Sastry krithis by Smt. Vidya Sankar (has Devanagari, telugu, tamil and english). I can check other books.

A possibility is to make it an option.
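As a rough idea of what such an option could look like on the Devanagari output, here is a Python sketch. It is an assumption-laden illustration, not the actual cmtranslit code: `render_nasals` and its `use_bindu` flag are made-up names, and the rule applied is the usual one of replacing a class nasal with a bindu only before a stop of the same class.

```python
VIRAMA = "\u094D"
ANUSVARA = "\u0902"

# Each class nasal mapped to the stops of its own class.
HOMORGANIC = {
    "\u0919": "\u0915\u0916\u0917\u0918",  # nga before ka kha ga gha
    "\u091E": "\u091A\u091B\u091C\u091D",  # nya before ca cha ja jha
    "\u0923": "\u091F\u0920\u0921\u0922",  # nna before tta ttha dda ddha
    "\u0928": "\u0924\u0925\u0926\u0927",  # na before ta tha da dha
    "\u092E": "\u092A\u092B\u092C\u092D",  # ma before pa pha ba bha
}


def render_nasals(text, use_bindu=False):
    """Optionally rewrite nasal + virama + same-class stop as anusvara + stop."""
    if not use_bindu:
        return text  # keep the explicit conjuncts
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        if (
            ch in HOMORGANIC
            and i + 2 < len(text)
            and text[i + 1] == VIRAMA
            and text[i + 2] in HOMORGANIC[ch]
        ):
            out.append(ANUSVARA)
            i += 2  # skip the nasal and its virama; the stop is kept next round
        else:
            out.append(ch)
            i += 1
    return "".join(out)


pankaja = "\u092A\u0919\u094D\u0915\u091C"
print(render_nasals(pankaja), render_nasals(pankaja, use_bindu=True))
```

Keeping the conjunct as the default and making the bindu opt-in matches the stricter convention described above, while still allowing the simplified spelling found in some books.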

Arun

#75

Post by drshrikaanth » 03 Feb 2007, 00:47

arunk
IIRC, we have discussed this very same issue earlier, in the previous thread on transliteration. Surely we don't need a recap?