Encoding problem with downloaded file


#1

Hi guys!

I am making an application that connects to the internet over GPRS and downloads a txt file from an http address. My main problem is the character encoding. When I read a specific line of the file that is in English, I have no problem at all! But when the line is in Greek, I get a weird string in which (except for numbers, spaces, commas and points) all Greek letters are given like this:

For example, for the Greek sequence of letters 9 Σ.Μπ I get this one: 9 &#x3A3.Μ&#x3BF.
On the webpage from which I download this file, the encoding is Unicode.
What can I do to get the correct letters, so that when I send them to a Greek mobile phone via SMS they can be read correctly?

Another problem is the following: how can I know the size of the file before downloading it over the http protocol? I know there is a "wip_getFileSize(channel, eventhandler)" function (in the ftp example in the wip datasheet), but I cannot use it, since httpCreateClient gets NULL for the eventhandler, so there cannot be any handler for wip_getFileSize(channel, eventhandler) like in the ftp example… How can this be done?

Please somebody help me!

Thank you very much!


#2

There should be an http header that tells you how big the file you’re going to receive is. But there is no guarantee that the webserver provides that information.
Another possibility is to dynamically allocate memory as you’re receiving the file.
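
To illustrate both suggestions, here is a minimal sketch in plain C. It assumes the size header in question is the standard `Content-Length` header and that its value has already been extracted as a string; how to obtain it from the WIP library is not shown. The names `parse_content_length`, `growbuf_t` and `growbuf_append` are hypothetical, and standard `malloc`/`realloc` are used for illustration only (an embedded application would substitute its own allocator).

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Parse the value of a Content-Length header (hypothetical helper).
 * Returns 0 and stores the size in *out on success, or -1 if the value
 * is missing or malformed, in which case the size is simply unknown. */
int parse_content_length(const char *value, unsigned long *out)
{
    char *end;
    unsigned long v;

    if (value == NULL)
        return -1;
    while (*value == ' ' || *value == '\t')   /* skip leading blanks */
        value++;
    if (*value < '0' || *value > '9')         /* must start with a digit */
        return -1;
    errno = 0;
    v = strtoul(value, &end, 10);
    if (errno == ERANGE)
        return -1;
    while (*end == ' ' || *end == '\t' || *end == '\r' || *end == '\n')
        end++;                                /* tolerate trailing blanks/CRLF */
    if (*end != '\0')
        return -1;
    *out = v;
    return 0;
}

/* Growable buffer for the case where the size is not known in advance. */
typedef struct {
    char  *data;
    size_t len;   /* bytes stored */
    size_t cap;   /* bytes allocated */
} growbuf_t;

/* Append n bytes, doubling the capacity when needed.
 * Returns 0 on success, -1 on allocation failure (buffer left unchanged). */
int growbuf_append(growbuf_t *b, const void *chunk, size_t n)
{
    if (n == 0)
        return 0;
    if (n > b->cap - b->len) {
        size_t cap = b->cap ? b->cap : 256;
        char *p;

        while (cap - b->len < n) {
            if (cap > (size_t)-1 / 2)
                return -1;                    /* capacity would overflow */
            cap *= 2;
        }
        p = realloc(b->data, cap);
        if (p == NULL)
            return -1;
        b->data = p;
        b->cap = cap;
    }
    memcpy(b->data + b->len, chunk, n);
    b->len += n;
    return 0;
}
```

Start with `growbuf_t b = {NULL, 0, 0};`, call `growbuf_append` for each block of data as it arrives, and `free(b.data)` when done. If `parse_content_length` succeeds, a single allocation of that size can replace the doubling strategy.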


#3

Not if you’re using “Chunked” encoding.

e.g., if it uses “Chunked” encoding…


#4

“Not if you’re using ‘Chunked’ encoding.”

Do you mean that the file size is put in some http header? What kind?

But what about the encoding? How is it possible to read this Greek data correctly with UTF-8? Is there something to write in wip_getFileOpts(), like:

wip_getFileOpts(http, FILE_NAME, http_event, NULL, WIP_COPT_HTTP_HEADER, "Accept", "text/xml", "UTF-8 or something…", WIP_COPT_END);

Or is this done in another way?

Please somebody help me with this!

Thanks!


#5

I found out that the values in the box below are hexadecimal values for the greek letters.
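
Sequences such as `&#x3A3;` look like HTML numeric character references, where the hexadecimal number is the Unicode code point of the letter (U+03A3 is Σ). The following is a minimal sketch in plain C of turning them back into UTF-8 text. It assumes well-formed references terminated by a semicolon; anything else is copied through unchanged. The function names are hypothetical and not part of any library.

```c
#include <stddef.h>
#include <string.h>

/* Encode one Unicode code point as UTF-8.
 * Returns the number of bytes written (1 to 4), or 0 if cp is not encodable. */
static size_t utf8_encode(unsigned long cp, char *out)
{
    if (cp < 0x80) {
        out[0] = (char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (char)(0xC0 | (cp >> 6));
        out[1] = (char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;                     /* surrogate halves are not characters */
        out[0] = (char)(0xE0 | (cp >> 12));
        out[1] = (char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {
        out[0] = (char)(0xF0 | (cp >> 18));
        out[1] = (char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}

/* Value of c as a digit in the given base (10 or 16), or -1 if it is not one. */
static int digit_value(char c, int base)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (base == 16 && c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (base == 16 && c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Copy src to dst, replacing "&#xHHHH;" and "&#DDDD;" with UTF-8 bytes.
 * Returns the length written (excluding the terminating NUL), or
 * (size_t)-1 if dst_size is too small. */
size_t decode_char_refs(const char *src, char *dst, size_t dst_size)
{
    size_t n = 0;

    if (dst_size == 0)
        return (size_t)-1;
    while (*src != '\0') {
        char buf[4];
        size_t len = 0;

        if (src[0] == '&' && src[1] == '#') {
            const char *p = src + 2;
            int base = 10, digits = 0, d;
            unsigned long cp = 0;

            if (*p == 'x' || *p == 'X') {
                base = 16;
                p++;
            }
            while (digits < 8 && (d = digit_value(*p, base)) >= 0) {
                cp = cp * (unsigned long)base + (unsigned long)d;
                p++;
                digits++;
            }
            if (digits > 0 && *p == ';' && cp != 0)
                len = utf8_encode(cp, buf);
            if (len > 0)
                src = p + 1;              /* skip the whole reference */
        }
        if (len == 0) {                   /* not a valid reference: copy one byte */
            buf[0] = *src++;
            len = 1;
        }
        if (n + len >= dst_size)
            return (size_t)-1;
        memcpy(dst + n, buf, len);
        n += len;
    }
    dst[n] = '\0';
    return n;
}
```

For instance, `decode_char_refs("&#x3A3;", out, sizeof out)` stores the two bytes `0xCE 0xA3`, which is Σ in UTF-8. Whether the receiving side then expects UTF-8 or another encoding is a separate question.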


#6

Does anybody know how can I send greek characters to a mobile phone?

I know there is an AT+CSCS="" command, and I tried all possible values (“GSM”, “PCCP437”, “CUSTOM”, “HEX”), but nothing changed… Can anybody help me?

Thanks


#7

A quick search on the internet seems to imply that it is totally dependent on your operator, so I suggest you give them a call to ask how you must format the characters to have them delivered as Greek characters.
There seems to be no default way to handle this.
(But then again, I may be wrong.)


#8

“Does anybody know how can I send greek characters to a mobile phone?”
We send strings in our language using PDU mode. Read the AT commands manual and google for it. By the way, we send them with an external processor, so I don’t really know if PDU mode can be used with the ADL API.
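
As a rough illustration of the text-encoding part of a PDU, here is a minimal sketch in plain C. It assumes the message is sent with the UCS2 data coding scheme (DCS 0x08), where the user data is the text as 16-bit big-endian code units, written as hex digits in the PDU string. Building the rest of the PDU (service centre address, destination number, and so on) is not shown, and the function names are hypothetical.

```c
#include <stddef.h>

/* Decode one UTF-8 sequence of 1 to 3 bytes (the range UCS2 can hold).
 * Returns the number of bytes consumed, or 0 on invalid or unsupported input. */
static size_t utf8_decode_bmp(const unsigned char *s, unsigned *cp)
{
    if (s[0] < 0x80) {
        *cp = s[0];
        return 1;
    }
    if ((s[0] & 0xE0) == 0xC0 && (s[1] & 0xC0) == 0x80) {
        *cp = ((s[0] & 0x1Fu) << 6) | (s[1] & 0x3Fu);
        return *cp >= 0x80 ? 2 : 0;       /* reject overlong forms */
    }
    if ((s[0] & 0xF0) == 0xE0 && (s[1] & 0xC0) == 0x80 &&
        (s[2] & 0xC0) == 0x80) {
        *cp = ((s[0] & 0x0Fu) << 12) | ((s[1] & 0x3Fu) << 6) | (s[2] & 0x3Fu);
        if (*cp < 0x800 || (*cp >= 0xD800 && *cp <= 0xDFFF))
            return 0;                     /* overlong form or surrogate */
        return 3;
    }
    return 0;                             /* 4-byte sequences do not fit in UCS2 */
}

/* Convert a NUL-terminated UTF-8 string to the hex text of its UCS2
 * encoding (4 hex digits per character, big-endian).
 * Returns the user data length in octets, or -1 if the input is invalid,
 * longer than 70 characters (one unconcatenated SMS), or hex is too small. */
int utf8_to_ucs2_hex(const char *utf8, char *hex, size_t hex_size)
{
    static const char digits[] = "0123456789ABCDEF";
    const unsigned char *s = (const unsigned char *)utf8;
    size_t chars = 0;

    if (hex_size == 0)
        return -1;
    while (*s != '\0') {
        unsigned cp;
        size_t used = utf8_decode_bmp(s, &cp);

        if (used == 0 || chars == 70 || (chars + 1) * 4 + 1 > hex_size)
            return -1;
        hex[chars * 4 + 0] = digits[(cp >> 12) & 0xF];
        hex[chars * 4 + 1] = digits[(cp >> 8) & 0xF];
        hex[chars * 4 + 2] = digits[(cp >> 4) & 0xF];
        hex[chars * 4 + 3] = digits[cp & 0xF];
        chars++;
        s += used;
    }
    hex[chars * 4] = '\0';
    return (int)(chars * 2);
}
```

The return value is what goes into the user data length field, and the hex string is the user data itself. For the UTF-8 text Σ.Μπ the function produces `03A3002E039C03C0` and returns 8.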


#9

Thanks guys for your help!


#10

Yes, it can. :)


#11

So that means that you are sending Greek characters!

The question, then, is not “how can I send greek characters” - but, rather, “how can I display the greek characters that I have received”

The answer to that, of course, will depend upon the particular display device…


#12

Yes, but this was supposed to be the representation of Greek characters in Unicode encoding. With UTF-8 in the http header, it should not display the Latin characters properly and the Greek characters in this way. But never mind, I finally used Greeklish for compatibility with the Google API. Thanks anyway!