
'Historic' day as first non-latin web addresses go live

This topic is archived.
 
dipsydoodle Donating Member (1000+ posts) Send PM | Profile | Ignore Thu May-06-10 08:38 AM
Original message
'Historic' day as first non-latin web addresses go live
Source: BBC News

Arab nations are leading a "historic" charge to make the world wide web live up to its name.

Net regulator Icann has switched on a system that allows full web addresses to contain no Latin characters.

Egypt, Saudi Arabia and the United Arab Emirates are the first countries to have so-called "country codes" written in Arabic scripts.

The move is the first step to allow web addresses in many scripts including Chinese, Thai and Tamil.

Read more: http://news.bbc.co.uk/1/hi/technology/10100108.stm
Speck Tater Donating Member (1000+ posts) Send PM | Profile | Ignore Thu May-06-10 10:12 AM
Response to Original message
1. The tower of Babel all over again. Now there will be millions of web sites we can't visit
because we don't have Arabic, Chinese, Thai, or Tamil keyboards.

A small step for an IP address, but a giant leap backwards into mutual incomprehensibility for mankind.
 
muriel_volestrangler Donating Member (1000+ posts) Send PM | Profile | Ignore Thu May-06-10 10:49 AM
Response to Reply #1
2. It means you can't type the address in, but how often do you?
Most URLs are accessed as hyperlinks, or perhaps pasted in. I'll try an experiment: the Egyptian Ministry's address, pasted from my clipboard - we'll see what the DU software does with it:

http://موقع.وزارة-الأتصالات.مصر/ar/default.aspx

and inside the DU 'link' format:

Egyptian website

Does that display something meaningful for you? For me, the DU software has changed the 1st one so it's not the full URL, but the first part of it goes to a placeholder .com site that involves a non-Latin character (I presume it's Arabic, but I don't know). The 2nd works OK.

(The URL came from the USA Today article on this - try it if the DU stuff doesn't work for you: http://content.usatoday.com/communities/ondeadline/post/2010/05/3-arab-countries-score-internets-first-use-of-non-latin-letters-in-country-code-domain/1?csp=34 )
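For anyone curious what the browser does with a pasted address like this: as far as I understand it, DNS itself still only carries ASCII, so each non-Latin label is first converted to an ASCII "xn--" form. Below is a minimal Python sketch using the standard library's idna codec. That codec follows the older IDNA 2003 rules, so for some names the result may differ from what a current browser produces. The helper names are made up for illustration.

```python
# Minimal sketch: convert between Unicode host names and their ASCII (xn--) form.
# to_ascii_host / to_unicode_host are hypothetical helper names, not a standard API.

def to_ascii_host(host):
    """Encode a Unicode host name label by label into its ASCII form."""
    return host.encode("idna").decode("ascii")

def to_unicode_host(host):
    """Decode an ASCII (xn--) host name back into Unicode."""
    return host.encode("ascii").decode("idna")

# The Egyptian TLD on its own, and the full host name quoted in the post.
tld_ascii = to_ascii_host("مصر")
full_ascii = to_ascii_host("موقع.وزارة-الأتصالات.مصر")

print(tld_ascii)
print(full_ascii)
```

The ASCII form is what actually gets looked up, which is why pasting works even without an Arabic keyboard.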
 
kirby Donating Member (1000+ posts) Send PM | Profile | Ignore Thu May-06-10 10:59 AM
Response to Reply #2
3. Useless...
The second link worked, but I couldn't even read the page!

:evilgrin:
 
muriel_volestrangler Donating Member (1000+ posts) Send PM | Profile | Ignore Thu May-06-10 11:56 AM
Response to Reply #3
6. Heh - that made me wonder if Google Translate can cope with it yet
and it can't. And the Yahoo (was AltaVista) Babel Fish translator doesn't even have an option for Arabic.
 
NYC Liberal Donating Member (1000+ posts) Send PM | Profile | Ignore Thu May-06-10 11:04 AM
Response to Reply #2
4. But how am I going to type in such a URL?
Edited on Thu May-06-10 11:04 AM by NYC Liberal
It's one thing if I'm just linking to a specific URL that I found on a page (I can copy and paste), but what if I'm just trying to write it, for example here in this post? I have to go find a page that links to it to get a hyperlink to copy -- oops, but how do I search for it?
 
muriel_volestrangler Donating Member (1000+ posts) Send PM | Profile | Ignore Thu May-06-10 11:49 AM
Response to Reply #4
5. But that doesn't mean you can't visit the site
which was the objection originally given. Anyway, I put most of the URLs in my DU posts (or other times when I'm recording a URL) using copy and paste already. You don't have to find a page that links to it to copy the URL to the clipboard - you can copy directly from the URL box in your browser.

 
piedmont Donating Member (1000+ posts) Send PM | Profile | Ignore Thu May-06-10 12:06 PM
Response to Reply #5
9. But you do have to find a page that links to it if you yourself can't type in the address.
Sure, you can copy the URL from the address bar-- if you can get to the page to begin with, which means you need a link if you can't type in the address directly.

I don't really have any strong feelings either way, but it will cause some inconvenience. Usually on the internet that's just an opportunity for someone else to come along and create something useful.
 
muriel_volestrangler Donating Member (1000+ posts) Send PM | Profile | Ignore Thu May-06-10 12:50 PM
Response to Reply #9
11. How often do you know a website address if you haven't got it on a computer?
If a print or broadcast medium tells you, I guess. Or someone tells you face to face. So, for those occasions, you'd have to use tinyurl or similar (such services are already spreading rapidly for Twitter, and would already be useful for any foreign site name made of syllables you don't understand and might misspell. Think of telling someone to go to 'gaddafi.com' - or would that be 'qaddafi.com' etc.).
 
piedmont Donating Member (1000+ posts) Send PM | Profile | Ignore Thu May-06-10 01:01 PM
Response to Reply #11
12. I give out my web address all the time by telephone, in person, on business cards, etc.
But my website is easy to spell, for that very reason.
 
Recursion Donating Member (1000+ posts) Send PM | Profile | Ignore Fri May-07-10 02:57 PM
Response to Reply #2
27. ".msr"
I'm going to blame that on DU's link-scrubbing; when I paste that into firefox's address bar it works fine.

(Incidentally, .مصر is the TLD in question; those three Arabic letters are the root consonants in the Arabic word for Egypt, "Misr")
 
Book Lover Donating Member (1000+ posts) Send PM | Profile | Ignore Thu May-06-10 11:59 AM
Response to Reply #1
7. You don't visit them now, since you can't read those languages
So what's your beef?
 
Retrograde Donating Member (1000+ posts) Send PM | Profile | Ignore Thu May-06-10 12:05 PM
Response to Reply #1
8. They're all just ones and zeroes anyway
That's all the machines at the bottom of the process understand: the overlying languages are just for the convenience of the humans. It may be a barrier for those who don't read Arabic, but it could be a boon for people just learning to read their own language, be it Arabic or whatever non-roman-alphabet-using one gets the next addresses.

 
Posteritatis Donating Member (1000+ posts) Send PM | Profile | Ignore Thu May-06-10 01:13 PM
Response to Reply #1
13. Newsflash: much of the Internet is not in English. (nt)
 
Blue_Tires Donating Member (1000+ posts) Send PM | Profile | Ignore Thu May-06-10 01:59 PM
Response to Reply #13
15. once the internets tolerated Klingon and l33t-speak, it was all over
Anyone coming to OUR internets should speak the English, or they should GO HOME to...to....:silly: :crazy: :yoiks:
 
JackRiddler Donating Member (1000+ posts) Send PM | Profile | Ignore Fri May-07-10 03:22 PM
Response to Reply #1
29. Nonsense.
If you couldn't read Arabic before, you still can't today.

And as the first reply to you says: Sites are almost always found by linking, not by typing in URLs.
 
quakerboy Donating Member (1000+ posts) Send PM | Profile | Ignore Sat May-08-10 04:09 AM
Response to Reply #1
35. Ah... But if you desire you can get those keyboards
Or just get overlays and configuration setups to use them. I've seen it done by language teachers. Easier with shared Latin symbols, as you don't have to change as much, but no reason it can't be done with non-Latin symbols.
 
Chicago dyke Donating Member (127 posts) Send PM | Profile | Ignore Sat May-08-10 06:00 AM
Response to Reply #1
36. i admit i enjoy english language websites
but let's admit: it's totally imperialist and unfair to expect native Tamil or Arabic speakers to learn english just to enjoy the internet. what would you say to a poor Tamil boy who goes to a school with a dirt floor and has access to a computer for an hour a day? "Suck it up, kid!" that's basically what you're saying when you say this isn't a good move.

americans are very lazy compared to most of the rest of the world. so few of us speak more than two languages. that's not the case in a lot of countries, but people around the world also deserve the ease and comfort of native language websites.
 
X_Digger Donating Member (1000+ posts) Send PM | Profile | Ignore Thu May-06-10 12:17 PM
Response to Original message
10. As a web dev, makes it a bit harder to parse URLs..
.. but I assume there'll need to be a better algorithm written and passed around.

Overall, no biggie to me.
 
ManiacJoe Donating Member (1000+ posts) Send PM | Profile | Ignore Thu May-06-10 09:56 PM
Response to Reply #10
17. Should be the same algorithm.
Unicode chars allow for a huge range of character values, well beyond the 65-122 span that covers A-Za-z. Chars are chars.

The big problem is making sure the older servers are upgraded to understand unicode in the first place.
 
X_Digger Donating Member (1000+ posts) Send PM | Profile | Ignore Thu May-06-10 10:56 PM
Response to Reply #17
19. well yah, that too..
.. but my standard toolkit that I've compiled and added to over the years has a bunch of regexes in different languages that I use for things like url validation, email addy validation, etc. Most of them are "hard coded" more or less (depending on language) - i.e. *?*?<0-9>*? or somesuch.

I know I've got a couple of databases that aren't set up for unicode, going to be a bit of a pain to convert.
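To make the "hard coded" problem concrete, here is a small Python sketch. The pattern is a hypothetical stand-in for the kind of ASCII-only host check described above, not the actual toolkit regex (which the board mangled). One possible fix, assuming the standard library's idna codec is acceptable, is to convert the host to its ASCII form first and keep the old pattern unchanged.

```python
import re

# Hypothetical ASCII-only host name pattern of the "hard coded" kind.
ASCII_HOST = re.compile(
    r"^(?:[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?\.)+[A-Za-z0-9-]{2,63}$"
)

def is_valid_host(host):
    """Validate a host name by converting it to ASCII (xn--) form first."""
    try:
        ascii_host = host.encode("idna").decode("ascii")
    except UnicodeError:
        # Empty labels, over-long labels, or disallowed characters.
        return False
    return bool(ASCII_HOST.match(ascii_host))

# The old pattern alone rejects a host under the new Egyptian TLD...
old_result = bool(ASCII_HOST.match("example.مصر"))
# ...while converting first lets the same pattern accept it.
new_result = is_valid_host("example.مصر")
```

The appeal of this approach is that the existing regex and any ASCII-only database columns can stay as they are, as long as the conversion happens at the edge.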
 
boppers Donating Member (1000+ posts) Send PM | Profile | Ignore Fri May-07-10 11:23 PM
Response to Reply #19
31. Your email validating regex is wrong, anyways. I'm 100% certain of it.
Minor pet peeve of mine is how many of these I've had to fix over the years... check this bad boy out:

http://www.ex-parrot.com/pdw/Mail-RFC822-Address.html

But you do make a good point, and it's not just your code, it's all the languages and libraries "upstream" that need to be fixed as well. I know the PHP team's been working on i18n/PHP6 for over five years, and still isn't done, because of the libraries that PHP uses. Well, that, and politics/bickering.

On the plus side, ruby has been all over this from day one, Perl seems reasonably covered (CPAN, OTOH...ugh), I have no idea on python, java, or the whole .NET cluster.
 
X_Digger Donating Member (1000+ posts) Send PM | Profile | Ignore Sat May-08-10 10:04 PM
Response to Reply #31
38. DU munged it (forgot it doesn't like brackets).
Edited on Sat May-08-10 10:08 PM by X_Digger
eta: and it's not only the languages / packages themselves, but also common code that's been written and passed around, and the assumptions that a coder made at some point.

Reminds me of the days when mailer.cgi was a popular bit of code. It wasn't distributed in any orderly fashion, so when someone found an exploit with it, there was no way to know who had it and who needed to fix it.
 
Recursion Donating Member (1000+ posts) Send PM | Profile | Ignore Fri May-07-10 02:55 PM
Response to Reply #10
26. There's a Georgian character that looks almost exactly like a lower-case "h"
But it has a different code point. So all somebody has to do is register a URL that looks like hsbcbank.com and start the phishing. There was a lot of concern about this a few years ago; IIRC the solution was to convince Verisign, Thawte, etc. not to sign certs for mixed-codepage domains.
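A rough way to spot this kind of look-alike is to check whether a single label mixes letters from more than one script. The Python sketch below approximates a letter's script by the first word of its Unicode character name, which is a simplification of my own rather than a standard API. I have also swapped in a Cyrillic look-alike (U+04BB) as the stand-in character, since that is one I am sure resembles a Latin "h".

```python
import unicodedata

def scripts_in_label(label):
    """Approximate each letter's script by the first word of its Unicode name."""
    return {unicodedata.name(ch).split()[0] for ch in label if ch.isalpha()}

def looks_mixed(host):
    """Return True if any dot-separated label mixes letters from several scripts."""
    return any(len(scripts_in_label(label)) > 1 for label in host.split("."))

genuine = "hsbcbank.com"
spoofed = "\u04bbsbcbank.com"  # first letter is CYRILLIC SMALL LETTER SHHA, not Latin h

print(looks_mixed(genuine))
print(looks_mixed(spoofed))
```

A check like this would not catch a name written entirely in one look-alike script, so it is only a first filter.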
 
slackmaster Donating Member (1000+ posts) Send PM | Profile | Ignore Thu May-06-10 01:15 PM
Response to Original message
14. This could be worse than Y2K
:scared:
 
No Elephants Donating Member (1000+ posts) Send PM | Profile | Ignore Thu May-06-10 02:07 PM
Response to Original message
16.  I thought the Pope was going to release an official pronouncement over the web in the vernacular.
That'll teach me to try to figure out a story from the headline.
 
bamacrat Donating Member (1000+ posts) Send PM | Profile | Ignore Thu May-06-10 10:43 PM
Response to Original message
18. Do Islamic computers type right to left?
I have always wondered that. If not, do they have to type backwards, or I guess there must be a translator. But that has to be annoying. No wonder they hate us. hahah.
 
Igel Donating Member (1000+ posts) Send PM | Profile | Ignore Thu May-06-10 11:29 PM
Response to Reply #18
21. "Islamic" computers might be in Malay or Indonesian.
They go from left to right, since they use Latin.

The point is that computers aren't "Islamic" or not. Their software can be for a right-to-left or left-to-right language. Even Arab Xians writing in Arabic need right-to-left software.

My computer, running Windows XP Professional, has only a slight problem with right-to-left encoding. Word XP (2002) handles it. It's right justified and reads right to left. Handy for Arabic and Persian. In fact, there's a little button on a toolbar; I haven't needed it for a couple of years, but from time to time I click it and everything's right justified, ragged left.

It's a bit goofy in that it's obviously a kludge, since my computer has English as the default system language and left-to-right as the default text. So when I would type Arabic it would have to sort of do it strangely--it would "feel" like left-to-right, but it would be "fixed" to right-to-left. The problem was that my computer's default was left to right. If I had an Arabic version of Word running on an Arabic copy of Windows, no problem.

I don't know if Macs have an easier time with it or not. Don't know if more up-to-date versions of Windows handle Arabic better or not.
 
Recursion Donating Member (1000+ posts) Send PM | Profile | Ignore Fri May-07-10 02:52 PM
Response to Reply #18
25. It's handled by Unicode in modern operating systems
There's even a unicode character that means "switch from left-to-right to right-to-left or vice versa".

I spent weeks debugging an application I was programming that displayed half of its text backwards, because of that damn character...

But, to answer your question: yes, and yours can too if you enable a RTL ("right-to-left") locale like Arabic, Hebrew, or Farsi. For example, go to العربية (Al Arabiya, a competitor to Al Jazeera) and you'll see all the text going from right to left (you should see it in the link title itself, assuming you have an Arabic font installed on your system).
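For the debugging headache mentioned above, a small Python sketch for finding and stripping invisible direction-control characters may help. The set below lists the Unicode marks, embeddings, overrides and isolates that I am aware of; the function names are made up for illustration.

```python
import unicodedata

# Invisible characters that change text direction (marks, embeddings, overrides, isolates).
BIDI_CONTROLS = {
    "\u200e", "\u200f",
    "\u202a", "\u202b", "\u202c", "\u202d", "\u202e",
    "\u2066", "\u2067", "\u2068", "\u2069",
}

def find_bidi_controls(text):
    """Return (index, character name) for every direction-control character."""
    return [(i, unicodedata.name(ch)) for i, ch in enumerate(text) if ch in BIDI_CONTROLS]

def strip_bidi_controls(text):
    """Remove all direction-control characters from the text."""
    return "".join(ch for ch in text if ch not in BIDI_CONTROLS)

sample = "abc\u202edef"  # contains a right-to-left override after "abc"
print(find_bidi_controls(sample))
print(strip_bidi_controls(sample))
```

Stripping is only appropriate for text that should never contain these controls, such as identifiers or log lines; in real Arabic or Hebrew prose they can be legitimate.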
 
JCMach1 Donating Member (1000+ posts) Send PM | Profile | Ignore Sat May-08-10 01:28 AM
Response to Reply #18
34. Answer is yes... I often get the automatic Arabic version of websites
Edited on Sat May-08-10 01:29 AM by JCMach1
...Google for instance and the search bar is right to left....

But it is not the computer... it's the program, webpage, or piece of code...
 
marshall Donating Member (1000+ posts) Send PM | Profile | Ignore Thu May-06-10 11:17 PM
Response to Original message
20. Will that make it easier for China to police its citizens' access to the web?
If they only make non-English character URLs available they could control the information.
 
NuttyFluffers Donating Member (1000+ posts) Send PM | Profile | Ignore Fri May-07-10 02:05 PM
Response to Original message
22. that's actually pretty cool. best of luck to them.
i expect most of East Asia to follow suit soon after. you already have huge swaths of the internet already disconnected from each other by language (and Babelfish translations are woefully inadequate already), might as well finish the job and encourage people to step out of their shells to learn another script.

(glad i already started practicing the Arabic abjad, btw. w/ Japanese, Korean and some Chinese already, all i need is Cyrillic and Sanskrit and i'm pretty good to go)
 
ChromeFoundry Donating Member (1000+ posts) Send PM | Profile | Ignore Fri May-07-10 02:17 PM
Response to Original message
23. Regular Expression validations...
Great, Regular Expression validations of URIs are going to be horrific to write, read and parse.

So, get used to clicking on links that read like this:
%E6%97%A5%E6%9C%AC%E8%AA%9E.%E6%97%A5%E6%9C%AC%E8%AA%9E%E6%97%A5.%E6%9C%AC%E8%AA%9E%E6%97%A5%E6%9C%AC%E8%AA%9E
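That string is ordinary percent-encoding of UTF-8 bytes, so it can be turned back into readable text. A minimal Python sketch with the standard library's urllib.parse, using the string quoted above:

```python
from urllib.parse import quote, unquote

encoded = (
    "%E6%97%A5%E6%9C%AC%E8%AA%9E."
    "%E6%97%A5%E6%9C%AC%E8%AA%9E%E6%97%A5."
    "%E6%9C%AC%E8%AA%9E%E6%97%A5%E6%9C%AC%E8%AA%9E"
)

# unquote() interprets the escaped bytes as UTF-8 by default.
decoded = unquote(encoded)

# quote() goes the other way; "." is left as-is because it needs no escaping.
round_trip = quote(decoded)
```

Here the escaped bytes spell out Japanese characters, so what looks like gibberish in a status bar is readable once decoded.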

And hopefully your bank doesn't start using them or you will never know for sure that you are typing your username/password into the correct site.

On a side note, I wonder if Linux, Mac and Windows will support this format in your HOSTS. files for local hostname resolution.


...hold on to your ass, Fred, the Internets are having the flood gates opened for new Trojans to worm their way into your system.
 
hayu_lol Donating Member (1000+ posts) Send PM | Profile | Ignore Fri May-07-10 02:24 PM
Response to Reply #23
24. Had one of these find its way onto my newsfeed page on...
FB yesterday. Took a bit but finally blocked it.

Pic of a young(25/30) middle eastern guy with a big smirk on his face.
 
Recursion Donating Member (1000+ posts) Send PM | Profile | Ignore Fri May-07-10 03:00 PM
Response to Reply #23
28. I just tried it on Debian Lenny; works fine
I mean, yes, it looks weird in a hosts file to have half of a line displayed right-to-left and the other half left-to-right, but bytes aren't actually in any spatial order, so it doesn't change anything.

Remember, a user node of the Internet isn't even supposed to know what the TLDs are; it just knows how to ask a name server who is authoritative for that TLD.
 
ChromeFoundry Donating Member (1000+ posts) Send PM | Profile | Ignore Fri May-07-10 10:30 PM
Response to Reply #28
30. Good to know it works...
now, just figuring out which ones (the bad ones) to redirect to 127.0.0.1
 
boppers Donating Member (1000+ posts) Send PM | Profile | Ignore Fri May-07-10 11:28 PM
Response to Reply #30
32. We need regex in hosts files...
Surely, somebody has thought of this before?
 
Recursion Donating Member (1000+ posts) Send PM | Profile | Ignore Sat May-08-10 05:02 PM
Response to Reply #32
37. Look at adns
They don't use the host file format but regex matching is part of their client-side capability.
 
Clovis Sangrail Donating Member (1000+ posts) Send PM | Profile | Ignore Fri May-07-10 11:53 PM
Response to Original message
33. bad idea
Aside from the already mentioned problems of not being able to get to one of these sites without already having a link to it (and I *do* type in URLs for places reasonably often) -
there is the issue of how these URLs are translating.
I scan links before I click on them - and if they're gibberish - redirects - or in some other way not recognizable as legitimate - I simply don't click them.
There are piles of exploits for IE/Windows that need nothing more than for you to visit a page in order for them to make an attempt at your machine.
I already avoid just about any .cn or .ru sites because those tlds are relative malware havens.

http://xn--4gbrim.xn----rmckbbajlc6dj7bxne2c.xn--wgbh1c does not parse as anything identifiable to me.
I know now .xn--wgbh1c is the tld for egypt but I'm not going to know what .xn--wghr4f or .zx--rsgr1c is.
I'll just avoid those sites - which is no skin off of their (or my) nose - but I'm not going to be alone in my response and that does lead to a restriction in the flow of information.
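The xn-- labels are not arbitrary: each one is a reversible ASCII encoding of the original label, so a link can be decoded before deciding whether to click it. A minimal Python sketch, assuming the standard library's idna codec (IDNA 2003 rules) is good enough for a quick look; decode_label is a made-up helper name.

```python
def decode_label(label):
    """Return the Unicode form of one DNS label, or None if it does not decode."""
    try:
        return label.encode("ascii").decode("idna")
    except UnicodeError:
        # Not valid ASCII, or an xn-- label that is not a well-formed encoding.
        return None

# Labels without the xn-- prefix are plain ASCII and come back unchanged.
decoded = {
    label: decode_label(label)
    for label in ["xn--wgbh1c", "xn--4gbrim", "com", "zx--rsgr1c"]
}
```

Only labels that start with xn-- carry an encoded name, so a label with any other prefix is just an ordinary ASCII string.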

 