18 October 2011

Importing UTF-8 characters into LDAP (TDS)

Related to the previous post is the following problem: how do you batch-process multiple LDAP entries from LDIF files when the entries contain UTF-8 characters (such as Polish diacritics)?
The approach is much the same, but it needs one additional step: Base64-encode the values that contain UTF-8 characters before passing them to LDAP.

So, if you run into errors like the following:
  • ldapmodify: no DN specified
  • ldap_add: Invalid DN syntax (34)
            additional info: R004054 Invalid UTF-8 character found in string value
you need to do the following:
  1. Prepare your data in a text file, but set the file encoding to UTF-8 before pasting or typing anything in, e.g.:

    dn: cn=Kłak Szósty,ou=1,ou=2,O=myorg,C=PL
    cn: Kłak Szósty
    sn: 12345678901
    objectclass: person
    objectclass: top

  2. Now use any tool to encode the UTF-8 strings into Base64, to get something like:

    dn:: Y249S8WCYWsgU3rDs3N0eSxvdT0xLG91PTIsTz1teW9yZyxDPVBM
    cn:: S8WCYWsgU3rDs3N0eQ==
    sn: 12345678901
    objectclass: person
    objectclass: top

    Make sure you add a second colon (:) after the attribute name for every Base64 value!

    For Base64 encoding I use the Notepad++ MIME plugin.
  3. Once the strings are Base64-encoded, convert the file to ANSI (save as ANSI). This does not change the way it looks on the screen, since the encoded values contain only ASCII characters.
  4. Copy the file to the target system using "binary" transfer mode and use it as input to the LDAP shell tools (ldapadd, ldapmodify).
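
If you would rather script step 2 than click through an editor plugin, it can be sketched in a few lines of Python. This is only a minimal sketch, not the method used above: the function name encode_ldif_line is made up for illustration, and it assumes one attribute per line (no folded lines). It rewrites "attr: value" as "attr:: base64" only when the value contains non-ASCII characters and leaves everything else alone.

```python
import base64


def encode_ldif_line(line: str) -> str:
    """Return the LDIF line with a non-ASCII value rewritten as 'attr:: <base64>'.

    Hypothetical helper for illustration; assumes unfolded 'attr: value' lines.
    """
    attr, sep, value = line.partition(":")
    # Leave blank lines, comments, continuation lines, and values that are
    # already Base64 ("attr::") or URL references ("attr:<") untouched.
    if not sep or line[0] in "# " or value[:1] in (":", "<"):
        return line
    value = value.strip()
    if value.isascii():
        return line  # plain ASCII values can stay as they are
    encoded = base64.b64encode(value.encode("utf-8")).decode("ascii")
    return f"{attr}:: {encoded}"  # note the double colon


entry = [
    "dn: cn=Kłak Szósty,ou=1,ou=2,O=myorg,C=PL",
    "cn: Kłak Szósty",
    "sn: 12345678901",
    "objectclass: person",
    "objectclass: top",
]
converted = [encode_ldif_line(line) for line in entry]
print("\n".join(converted))
```

For the sample entry from step 1 this prints the same dn:: and cn:: lines as shown in step 2. Because the result is pure ASCII, it can be written out with any single-byte encoding before the transfer in step 4.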
It should now work nice and smooth. The tools of choice are still WinSCP and Notepad++, of course (for encoding you need the MIME plugin). Good luck!
