When I highlight a file containing Japanese characters, the output is correct if I tell Pygments which lexer to use, but the characters are mangled if I ask Pygments to guess the lexer.
Using the attached "test.rb" file, both
cat test.rb | pygmentize -l ruby -O encoding=utf-8
pygmentize -O encoding=utf-8 test.rb
work correctly, while
cat test.rb | pygmentize -g -O encoding=utf-8
results in mangled characters.
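For illustration only, here is a minimal Python sketch of the kind of mangling that typically appears when UTF-8 bytes are decoded with a single-byte encoding such as Latin-1. That this is what happens on the -g path is an assumption, not something confirmed in this report, and the sample string is hypothetical (it is not the content of the attached test.rb).

```python
# Hypothetical sample; the real test.rb is attached to the report, not shown here.
source = 'puts "こんにちは"'

# The bytes as they would arrive on stdin from a UTF-8 encoded file.
utf8_bytes = source.encode("utf-8")

# Decoding with the encoding that was requested gives back the original text.
correct = utf8_bytes.decode("utf-8")

# Assumption: the guessed lexer ignores encoding=utf-8 and decodes as Latin-1,
# so every byte of a multi-byte character becomes its own character.
mangled = utf8_bytes.decode("latin-1")

print(correct)
print(mangled)
```

If the mangled output from the -g command resembles the second printed line, that would suggest the encoding option is not reaching the guessed lexer.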
This happens with all of the formatters.
Tested on Pygments 1.5 and Pygments 1.6rc1. It also happens with multiple lexers.