utf-8 problem?

Issue #48 resolved
António Ramos created an issue

when reading this string "PORCELANA PARA MAÇARICO DE IGNIÇ." from a table column in my iseries

i get this from a nodejs script

PORCELANA P/MAÇARICO DE IGNIèX"

can you help understand what is wrong with my code ?

var db2i = require("idb-connector"); var clc = require("colors-cli"); const express = require('express') var bodyParser = require('body-parser'); const app = express() const port = 8000 const error = clc.red; const warn = clc.yellow; const notice = clc.green; app.use(bodyParser.json()); // support json encoded bodies app.use(bodyParser.urlencoded({ extended: true })); // support encoded bodies app.post('/db2query', (req, res) => { try { var dbconn = new db2i.dbconn(); dbconn.setConnAttr(db2i.SQL_ATTR_DBC_SYS_NAMING, db2i.SQL_TRUE); dbconn.conn("*LOCAL", "xxx", "xxx"); // connect to local database let query = req.body.query; let key = req.body.key; if (key == "xxxxx") { var stm = new db2i.dbstmt(dbconn); stm.exec(query, function (result) { res.send(result) stm.close() console.log(notice(new Date()+" Executada query\n"), (result)); }) } else { res.send("error") console.log(error(new Date()+" Tentativa ilegal de executar \n"), (query)); } res.end } catch (err) { console.log(error(new Date()+" Erro no script")); res.end("erro") } }) app.listen(port, () => console.log(NodeJs proxy listening on port ${port}!))

thank you

Comments (11)

  1. mengxumx Account Deactivated

    @ramstein74

    Do you have an url that I can access to recreate the issue?

    I am not sure if it is a HTTP encoding issue. I am not familiar with express, but you may try to add a HTTP respond tag to set the correct encoding like

    res.set('Content-Type', 'text/html;charset=utf-8');
    res.send(result);
    stm.close();
    
  2. mengxumx Account Deactivated

    Hello @ramstein74 , I checked the code and found that idb-connector only expands the buffer for wide characters --> https://bitbucket.org/litmis/nodejs-idb-connector/src/master/src/db2ia/dbstmt.cc#lines-1927

    Using Kevin's code in https://github.com/ibmdb/python-ibmdb/issues/359, I can recreate the truncated issue if I use the char column to store the string --> create or replace table xumeng.english(c char(10) ccsid 37);

    But if I use the nchar column, there is no error --> create or replace table xumeng.english(c nchar(10) ccsid 37);

    If possible, would you try to query the string from a nchar column? If it resolve the problem, then it is the same issue as the python ibm_db.

  3. António Ramos reporter

    i´m using tables not creating ones. I´m trying to modernize my erp with some nodejs code.

    I created a table and a field c with len 10 i pasted "ÇÇÇÇÇÇÇÇÇÇ"

    if i query it i get only "ÇÇÇÇÇ" only five instead of 10 Ç according to Kevin Adler

    "The problem is that ibm_db uses the returned column size information to determine the size of the buffer to allocate on the SQLBindCol. In this case a column size of 10 will result in a 10-byte buffer being allocated. However, converting to UTF-8 will see the byte length of the column increase to greater than the size of the buffer. A 'Ç' is 1 byte in single byte EBCDIC code pages, but 2 bytes in UTF-8. If the data expands beyond the buffer size, the result will be truncated and the total size of the data will be returned in the indicator."

    so a Ç takes 2 bytes and i get only the first 5 Ç because your driver thinks it already have the 10 bytes.

    Is it correct ? How can you help me solve this issue and get the 10 Ç chars inside my field of len 10 ?

    regards

  4. mengxumx Account Deactivated

    Hello @ramstein74 ,

    The only concern is that bigger buffer requires more memory usage.

    Anyway, I have updated idb-connector to v1.1.5 to use a bigger buffer for those Unicode characters. You can upgrade it to verify that.

    code change --> src/db2ia/dbstmt.cc

  5. mengxumx Account Deactivated

    1) Not sure if your project use the global npm path or the local one. It depends on how you install idb-connector (-g or not). Would you check the effective version of nodejs and idb-connector and ensure the upgraded idb-connector is the one used by your code?

    2) Would you try to run my test code?

    const db2i = require("idb-connector");
    const dbconn = new db2i.dbconn();
    dbconn.conn("*LOCAL");
    const dbstmt = new db2i.dbstmt(dbconn);
    const wchar = "ÇÇÇÇÇÇÇÇ";
    dbstmt.exec("SELECT * FROM (VALUES CHAR('" + wchar + "')) AS X (C)", (rs) => {
      console.log("[char output] " + rs[0].C);
      console.log("[char expect] " + wchar);
      dbstmt.close();
    });
    

    Below is the test results between 1.1.4 and 1.1.5 on my system. You can see that 1.1.4 truncate half of the input while 1.1.5 not. (my terminal does not support the encoding but it does not matter)

    bash-4.4$ npm -g i idb-connector@1.1.4
    [idb-connector] Success: "/QOpenSys/pkgs/lib/nodejs10/lib/node_modules/idb-connector/lib/binding/Release/node-v64-ibmi-ppc64/db2ia.node" is installed via remote
    + idb-connector@1.1.4
    updated 1 package in 10.019s
    
    bash-4.4$ node test.js
    [char  output] ÇÇÇÇ
    [char  expect] ÇÇÇÇÇÇÇÇ
    
    bash-4.4$ npm -g i idb-connector@1.1.5
    [idb-connector] Success: "/QOpenSys/pkgs/lib/nodejs10/lib/node_modules/idb-connector/lib/binding/Release/node-v64-ibmi-ppc64/db2ia.node" is installed via remote
    + idb-connector@1.1.5
    updated 1 package in 9.428s
    
    bash-4.4$ node test.js
    [char  output] ÇÇÇÇÇÇÇÇ
    [char  expect] ÇÇÇÇÇÇÇÇ
    
  6. Log in to comment