Thread 'Code for boincstats pages'

Message boards : Web interfaces : Code for boincstats pages

mo.v
Joined: 13 Aug 06
Posts: 778
United Kingdom
Message 9669 - Posted: 18 Apr 2007, 20:39:10 UTC
Last modified: 18 Apr 2007, 21:09:29 UTC

On cpdn we have a team trying unsuccessfully to make its name, Universität der Bundeswehr München, display properly on its Boincstats page:

http://www.boincstats.com/stats/team_graph.php?pr=cpdn&id=5624

though on cpdn it displays correctly:

http://climateapps2.oucs.ox.ac.uk/cpdnboinc/team_display.php?teamid=5624

Cpdn member Richard Rodway says

'It's definitely UTF-8 that's appearing on the boincstats pages, and it looks like the correct (2-byte) UTF-8 sequences are being used. Unfortunately the page is being served as an ISO-8859-1 page, and as a result each 2-byte sequence is being interpreted not as one character but as two. This apparently is being done by the server, since it's specifically encoding these bytes to appear correctly as 8859-1 characters, so changing the page encoding in the browser will not work! (It's using HTML entities to render the characters.)

I notice that the climateprediction page for that team is also an 8859-1 encoded page, but in this case the correct code values are being used: 'ä' is encoded as the single byte 0xE4 in 8859-1, and that is what the cpdn pages use.

I don't know how the team name is getting propagated to the boincstats servers, but something along the way has translated it to UTF-8. The encoding for 'ä' in UTF-8 is the 2-byte sequence 0xC3 0xA4. However, if you read that as 8859-1, then instead of the sequence being translated into the one character U+00E4 (ä), it gets viewed as the two 8859-1 characters 0xC3 and 0xA4: 0xC3 is Ã and 0xA4 is ¤. The server is reading the UTF-8 sequence and probably storing it unchanged in the database. Then, whenever a page is generated, it's reading that data from the database and assuming that it is ISO-8859-1.

To fix the problem you need to make sure that whatever is sending the team names to boincstats is doing so in an encoding that boincstats understands. There's nothing at all wrong with UTF-8, and my preferred solution is for boincstats to use UTF-8 in its webpages and database (or at least some variant of Unicode in the database). Not only would this fix this problem, it would also allow teams (and names) to use any character, such as Japanese or Korean characters, which is quite impossible in 8859-1: there are only 256 characters in that character set, as opposed to about 1.1 million code points in Unicode (although I think only about 150,000 are currently in use).'
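
The mix-up Richard describes can be reproduced in a few lines of Python. This is only an illustrative sketch: the variable names are made up for the example, and it says nothing about how the boincstats or cpdn servers are actually implemented.

```python
# Illustration only: reproduces the byte values quoted in the post above.
name = "Universität der Bundeswehr München"

utf8_bytes = name.encode("utf-8")          # 'ä' becomes the two bytes 0xC3 0xA4
latin1_bytes = name.encode("iso-8859-1")   # 'ä' is the single byte 0xE4

# Reading UTF-8 bytes as ISO-8859-1 turns each 2-byte sequence into
# two separate characters (0xC3 -> 'Ã', 0xA4 -> '¤').
garbled = utf8_bytes.decode("iso-8859-1")

# If only the garbled text survives, reversing the two steps recovers
# the original name, provided no bytes were lost along the way.
repaired = garbled.encode("iso-8859-1").decode("utf-8")

# ascii() shows the code points without depending on the console encoding.
print(ascii(garbled))
print(ascii(repaired))
```

The repair step only works while the original bytes are still intact; once a character has been replaced by a placeholder, the information is gone.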


The cpdn discussion is here - one sees that none of the usual solutions work:

http://climateapps2.oucs.ox.ac.uk/cpdnboinc/forum_thread.php?id=5409

This problem must affect teams from all projects. Any hope of a solution?


Richard and Mo


ID: 9669
KSMarksPsych
Joined: 30 Oct 05
Posts: 1239
United States
Message 9670 - Posted: 18 Apr 2007, 20:50:42 UTC
Last modified: 18 Apr 2007, 20:53:35 UTC

How about I open a ticket in the new Trac system for this???

(then Rytis will actually have a new bug to work on instead of the stale ones we're porting over from BOINCZilla)

[edit]Or you or the OP can do it...
http://boinc.berkeley.edu/trac
You'll just need to create an account.
[/edit]
Kathryn :o)
ID: 9670
mo.v
Joined: 13 Aug 06
Posts: 778
United Kingdom
Message 9671 - Posted: 18 Apr 2007, 21:04:11 UTC
Last modified: 18 Apr 2007, 21:30:57 UTC

I didn't know about the new Trac system. On attempting to register I'm getting the rather alarming message

There is a problem with this website's security certificate.


The security certificate presented by this website was not issued by a trusted certificate authority.
The security certificate presented by this website was issued for a different website's address.

Security certificate problems may indicate an attempt to fool you or intercept any data you send to the server.
We recommend that you close this webpage and do not continue to this website.
Click here to close this webpage.
Continue to this website (not recommended)


but will continue regardless!

Ticket registered - thanks Kathryn

http://boinc.berkeley.edu/trac/ticket/57


I think there's a bug in the ticket system's own formatting for displaying more than one paragraph of bold. After the first paragraph, bold starts displaying as italic unless you add an extra ' manually. I'll let them get used to it all before I tell them about that as well.....


ID: 9671
Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15563
Netherlands
Message 9682 - Posted: 19 Apr 2007, 9:59:26 UTC

One of these days Willy (of Boincstats) will answer here as well.

In the meantime, I checked at KWSN stats and found something weird:

Universität der Bundeswehr München

and

Universität der Bundeswehr München

They both show. So it may not be the XML exports that are at fault.
ID: 9682
mo.v
Joined: 13 Aug 06
Posts: 778
United Kingdom
Message 9689 - Posted: 19 Apr 2007, 17:31:26 UTC

When certain people email me in French from France, their accented vowels also split into two characters. But other emails in French display correctly.
ID: 9689
mo.v
Joined: 13 Aug 06
Posts: 778
United Kingdom
Message 9720 - Posted: 20 Apr 2007, 18:57:26 UTC
Last modified: 20 Apr 2007, 19:00:19 UTC

I see that our ticket #57 has been assessed as a major defect and given a high priority!

http://boinc.ssl.berkeley.edu/trac/query

I wonder what the 'Milestone' column means?
ID: 9720
KSMarksPsych
Joined: 30 Oct 05
Posts: 1239
United States
Message 9721 - Posted: 20 Apr 2007, 18:59:52 UTC - in response to Message 9720.  
Last modified: 20 Apr 2007, 19:22:47 UTC

I see that our ticket #57 has been assessed as a major defect and given a high priority!



Well now Rytis has something else to do :)

Not like he isn't busy enough with school and PG and rewriting the forum code.

[edit]
Scratch that... It's being passed around the developers. Looks like Rom assigned it to David
[/edit]
Kathryn :o)
ID: 9721
[BOINCstats] Willy
Joined: 28 Jun 06
Posts: 12
Netherlands
Message 9869 - Posted: 23 Apr 2007, 20:51:17 UTC

I think the problem is in the way CPDN (and possibly other projects) exports its stats.

In the XML file are these lines:
<team>
 <id>5624</id>
 <type>6</type>
 <name>Universit&#239;&#191;&#189;t der Bundeswehr M&#239;&#191;&#189;nchen</name>
....


Notice the HTML character codes. When you put the team name in an HTML file and view it in a browser, it translates to the wrong characters seen on BOINCstats.

The wrong characters are also seen on other stats sites.
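
For anyone who wants to check what those codes stand for, here is a small Python sketch that decodes the `<name>` value quoted above. It only uses the standard `html` module; the variable names are made up for the example. The three numbers 239, 191, 189 are the bytes 0xEF 0xBF 0xBD, which is the UTF-8 encoding of U+FFFD, the Unicode replacement character, and not the UTF-8 bytes of 'ä' (195, 164). That would suggest, though this is only a guess, that the original character was already lost before the entities were written.

```python
import html

# The <name> value exactly as quoted from the XML export above.
exported = "Universit&#239;&#191;&#189;t der Bundeswehr M&#239;&#191;&#189;nchen"

# Step 1: resolve the numeric character references
# (239, 191, 189 -> U+00EF, U+00BF, U+00BD), as a browser would.
as_displayed = html.unescape(exported)

# Step 2: treat those three code points as raw bytes and decode them as UTF-8.
as_bytes = as_displayed.encode("iso-8859-1")
decoded = as_bytes.decode("utf-8")

# ascii() shows the code points without depending on the console encoding.
print(ascii(as_displayed))
print(ascii(decoded))
```

Both the 'ä' and the 'ü' come out as the same U+FFFD placeholder, so the exported name alone is not enough to tell which letters were there originally.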


BOINCstats | BAM!
ID: 9869

Copyright © 2024 University of California.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.