I have a PHP web application that communicates with Windows shares via the smbclient binary. My problem was that some characters (particularly accented characters such as é) were not being shown on the PHP page and were even causing lines of output to be omitted. After ensuring the PHP page was outputting UTF-8, I moved on to checking the output at the command line and comparing various shell_exec commands. This narrowed the problem to the smbclient command rather than the shell_exec command.
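One way to do this kind of command-line check is to test whether a command's output is valid UTF-8 at all. The sketch below is only an illustration, not the exact check I ran: is_utf8 is a hypothetical helper name, and it assumes an iconv (such as GNU iconv) that exits non-zero on invalid input. The two sample strings are "héllo" written out byte by byte, once as UTF-8 and once as CP850.

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit 0) only if stdin is valid UTF-8.
# Assumes iconv exits non-zero when it meets an invalid byte sequence.
is_utf8() {
    iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1
}

# "héllo" as UTF-8: é is the two bytes 0xC3 0xA9 (octal 303 251).
if printf 'h\303\251llo\n' | is_utf8; then
    echo "UTF-8 sample: valid UTF-8"
fi

# "héllo" as CP850: é is the single byte 0x82 (octal 202),
# which is not a valid UTF-8 sequence on its own.
if ! printf 'h\202llo\n' | is_utf8; then
    echo "CP850 sample: not valid UTF-8"
fi
```

The same helper can be fed the output of any command through a pipe, which makes it easy to tell whether the bad bytes come from the command itself or appear later in the page.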
So I started to investigate the charset settings in Samba. Running the following command showed me the default charset settings for Samba:
testparm -v | grep "charset"
dos charset = CP850
unix charset = UTF-8
display charset = LOCALE
These settings were giving me the unwanted output. See the output below, particularly the ‘h h jonny’ line (compare it with the output in the next image), which should actually be three lines, but each héllo is cut off at the é:
There are many more details on the charsets used in Samba/Windows here, but I found that editing the smb.conf file and manually setting the display charset to UTF-8 solved my problem. This was despite my Linux locale being en_US.UTF-8:
vi /etc/samba/smb.conf
dos charset = CP850
unix charset = UTF-8
display charset = UTF-8
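To confirm that the edit was picked up, the testparm check from earlier can be rerun. The sketch below wraps it in a hypothetical helper, show_charsets (my own name, not part of Samba), and adds the -s flag so that testparm does not wait for Enter before dumping the configuration. Which charset parameters are listed will depend on the Samba version installed.

```shell
#!/bin/sh
# Hypothetical helper: print the charset settings Samba reports
# for the current smb.conf, or complain if testparm is missing.
show_charsets() {
    if command -v testparm >/dev/null 2>&1; then
        # -s: do not prompt for Enter; -v: include default values too.
        testparm -s -v 2>/dev/null | grep -i "charset"
    else
        echo "testparm not found; is Samba installed?" >&2
        return 1
    fi
}

show_charsets || true
```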