
Ticket #116 (new defect)

Opened 3 months ago

Last modified 3 months ago

"%s" in format strings in MacRuby doesn't work with multibyte characters

Reported by: jordan.breeding@… Owned by: lsansonetti@…
Priority: blocker Milestone: MacRuby 1.0
Component: MacRuby Keywords:
Cc:

Description

This code:

printf("straight puts: ") 
puts(p["name"]) 
printf("straight printf: ") 
printf("%s\n", p["name"]) 
printf("string formatted puts: ") 
puts("%s" % [p["name"]]) 
printf("nsstring printf: ") 
printf("%@\n", p["name"]) 
printf("nsstring formatted puts: ") 
puts("%@" % [p["name"]]) 
printf("format using nsstring, then puts: ") 
puts(NSString.stringWithFormat("%@", p["name"]))

produces this output:

straight puts: Surgeon’s Girl 
straight printf: Surgeon’s Girl 
string formatted puts: Surgeon’s Girl 
nsstring printf: Surgeon’s Girl 
nsstring formatted puts: Surgeon’s Girl 
format using nsstring, then puts: Surgeon’s Girl

Change History

Changed 3 months ago by lsansonetti@…

  • milestone set to MacRuby 1.0

Actually, it looks like it's not related to MacRuby.

$ cat t.m
#import <Foundation/Foundation.h>
int main(void)
{
  NSLog(@"test1: %s\n", [@"あいうえお" UTF8String]);
  NSLog(@"test2: %@\n", @"あいうえお");
  return 0;
}
$ gcc t.m -o t -framework Foundation
$ ./t
2008-08-30 01:26:59.491 t[25923:10b] *** _NSAutoreleaseNoPool(): Object 0x1050a0 of class NSCFData autoreleased with no pool in place - just leaking
Stack: (0x961d1cdf 0x960de562 0x960f2c35 0x960f2811)
2008-08-30 01:26:59.493 t[25923:10b] test1: あいうえお
2008-08-30 01:26:59.494 t[25923:10b] test2: あいうえお
$ 

Changed 3 months ago by jordan.breeding@…

As an experiment this morning, I compiled the following:

#import <Cocoa/Cocoa.h>
#import <stdio.h>

int main(int argc, char *argv[])
{
    NSString *testString;

    NSLog(@"TEST 1\n");
    testString = @"Jördan";
    NSLog(@"%s", [testString UTF8String]);
    NSLog(@"%@", testString);
    printf("%s\n", [testString UTF8String]);

    NSLog(@"TEST 2\n");
    testString = [NSString stringWithFormat: @"%s", [@"Jördan" UTF8String]];
    NSLog(@"%@", testString);

    NSLog(@"TEST 3\n");
    testString = [NSString stringWithFormat: @"%@", @"Jördan"];
    NSLog(@"%@", testString);

    return(0);
}

using this command:

gcc -fobjc-gc -o test test.m -framework Cocoa

Here is the output I got:

2008-08-30 16:13:35.977 test[42236:10b] TEST 1
2008-08-30 16:13:35.980 test[42236:10b] Jördan
2008-08-30 16:13:35.981 test[42236:10b] Jördan
Jördan
2008-08-30 16:13:35.982 test[42236:10b] TEST 2
2008-08-30 16:13:35.982 test[42236:10b] Jördan
2008-08-30 16:13:35.983 test[42236:10b] TEST 3
2008-08-30 16:13:35.983 test[42236:10b] Jördan 

So I think this is a problem only with NSString formatting (not with, say, printf), and I don't know whether it is fixable, because while digging around here:

http://developer.apple.com/documentation/Cocoa/Conceptual/Strings/Articles/formatSpecifiers.html#//apple_ref/doc/uid/TP40004265

I saw this:

%s Null-terminated array of 8-bit unsigned characters. %s interprets its input in the system encoding rather than, for example, UTF-8.

So, other than passing Ruby strings directly using "%@", I don't know of a way to specify a format containing %s and have its argument interpreted in anything other than the ASCII/system encoding.
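
To illustrate the behavior the documentation describes, here is a minimal sketch in plain Ruby (2.0 or later, not verified on MacRuby) of what happens when UTF-8 bytes are read as a single-byte encoding. It assumes the "system encoding" is MacRoman, which the ticket does not state, and misread_utf8_as is a hypothetical helper written for this example, not part of MacRuby or Foundation.

```ruby
# Hypothetical helper: reinterpret the UTF-8 bytes of a string as if they
# were text in some other (single-byte) encoding, the way "%s" is documented
# to treat its argument.
def misread_utf8_as(str, assumed_encoding)
  bytes = str.encode("UTF-8").b            # copy of the raw UTF-8 bytes
  bytes.force_encoding(assumed_encoding)   # relabel the bytes, no conversion
  bytes.encode("UTF-8")                    # transcode so the result is printable
end

original = "Jördan"

# Assumption: the system encoding is MacRoman.
garbled = misread_utf8_as(original, "macRoman")

puts original   # Jördan
puts garbled    # J√∂rdan

# The damage is reversible as long as every byte maps to a character:
# encode back to MacRoman bytes and relabel them as UTF-8.
recovered = garbled.encode("macRoman").force_encoding("UTF-8")
puts recovered  # Jördan
```

If this is what is going on, it would explain why "%@" is unaffected: the string object is passed as-is, so no byte-level reinterpretation takes place.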
