I am having a small problem with the UTF-8 charset. I have a UTF-8 encoded file which I want to load and analyze. I am using a BufferedReader to read the file line by line.
BufferedReader buffReader = new BufferedReader(new InputStreamReader(new FileInputStream(file), "UTF-8"));
My problem is that the normal String methods (trim() and equals(), for example) do not behave as expected on the line read from the BufferedReader in each iteration of the loop I wrote to read its content. For example, the encoded file contains "< menu >", which I want my program to treat as is, but it is seen as "?? < m e n u >", mixed with other strange characters. I want to know if there is a way to remove the charset codifications and keep just the plain text, so that I can use the methods of the String class without complications. Thank you.
If your JDK is not too old (1.5 or later), you can use a Scanner:
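    Locale frLocale = new Locale("fr", "FR");
    Scanner scanner = new Scanner(new FileInputStream(file), "UTF-8");
    scanner.useLocale(frLocale);
    int numLine = 0;
    String line;
    // Read the file line by line, counting the lines as we go.
    for (; scanner.hasNextLine(); numLine++) {
        line = scanner.nextLine();
    }
    scanner.close();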
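As a minimal, self-contained sketch of the same idea, the following reads UTF-8 text with a Scanner and then applies trim() and equals() to each line. The class name `Utf8LineReader` and the helper `readLines` are hypothetical names chosen for illustration, an in-memory stream stands in for the file so the example runs anywhere, and Java 7 or later is assumed (for try-with-resources and `StandardCharsets`).

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class Utf8LineReader {

    // Hypothetical helper: returns every line of the stream, decoded as UTF-8.
    static List<String> readLines(InputStream in) {
        List<String> lines = new ArrayList<>();
        // The charset name tells the Scanner how to turn bytes into chars.
        try (Scanner scanner = new Scanner(in, "UTF-8")) {
            while (scanner.hasNextLine()) {
                lines.add(scanner.nextLine());
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        // Stand-in for the file content; replace with a FileInputStream in practice.
        byte[] data = "  < menu >  \ncaf\u00e9\n".getBytes(StandardCharsets.UTF_8);
        for (String line : readLines(new ByteArrayInputStream(data))) {
            // Once decoded correctly, the usual String methods apply.
            System.out.println(line.trim().equals("< menu >"));
        }
        // prints: true, then false
    }
}
```

To read a real file, pass `new FileInputStream(file)` to `readLines` instead of the in-memory stream.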
The scanner can also use delimiters other than whitespace. This example reads several items from a string:
    String input = "1 fish 2 fish red fish blue fish";
    Scanner s = new Scanner(input).useDelimiter("\\s*fish\\s*");
    System.out.println(s.nextInt());
    System.out.println(s.nextInt());
    System.out.println(s.next());
    System.out.println(s.next());
    s.close();

It prints the following output:

    1
    2
    red
    blue
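Coming back to the symptom in the question: stray characters before "< menu >" and gaps between the letters can indicate that the file starts with a byte order mark (BOM), or that it is not really UTF-8 but UTF-16. This is only a guess from the description, not something the question confirms. The sketch below inspects the first bytes to choose a charset and removes a leading U+FEFF from the decoded text. The class `BomSniffer` and its methods are hypothetical, and it assumes Java 7 or later. It does not cover UTF-32 or files without a BOM.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class BomSniffer {

    // Hypothetical helper: guesses the charset from a leading BOM.
    // Falls back to UTF-8 when no known BOM is present.
    static Charset detect(byte[] head) {
        if (head.length >= 3
                && (head[0] & 0xFF) == 0xEF
                && (head[1] & 0xFF) == 0xBB
                && (head[2] & 0xFF) == 0xBF) {
            return StandardCharsets.UTF_8;      // EF BB BF
        }
        if (head.length >= 2 && (head[0] & 0xFF) == 0xFF && (head[1] & 0xFF) == 0xFE) {
            return StandardCharsets.UTF_16LE;   // FF FE
        }
        if (head.length >= 2 && (head[0] & 0xFF) == 0xFE && (head[1] & 0xFF) == 0xFF) {
            return StandardCharsets.UTF_16BE;   // FE FF
        }
        return StandardCharsets.UTF_8;
    }

    // Removes a leading U+FEFF, which the decoders used here leave in the text.
    static String stripBom(String text) {
        return text.startsWith("\uFEFF") ? text.substring(1) : text;
    }

    public static void main(String[] args) {
        // Stand-in for the first line of a UTF-16LE file that starts with a BOM.
        byte[] raw = "\uFEFF< menu >".getBytes(StandardCharsets.UTF_16LE);
        Charset charset = detect(raw);
        String line = stripBom(new String(raw, charset));
        System.out.println(charset.name() + ": " + line.equals("< menu >"));
    }
}
```

If `detect` reports a UTF-16 charset for the real file, passing that charset name to the InputStreamReader or Scanner instead of "UTF-8" may be enough to make trim() and equals() behave as expected.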