Saturday, January 10, 2015

Accessing the encrypted file path in ImageNow

Intellectual property protection

Many years ago I took on a project to determine the protocol used to communicate between a Norand hand-held computer and a Norand 4820 printer. Norand called this protocol the Norand printer control protocol (NPCP). This was a turnkey proprietary solution with Norand controlling hardware and software. It was designed for the connection between a portable computer and printer for route accounting and sold primarily to bakery and beverage delivery companies.

As the Norand solution aged and the cost of functionally similar printers fell well below that of the Norand product, demand opened up for third-party printers that could be used in place of the Norand printers. The only problem was that the portable computers would print only using NPCP, a protocol that was unpublished.

At the time the world was starting to see the first free and open source software. Proprietary computing systems were giving way to DOS, the new Windows, Unix, XMODEM, and TCP/IP. The proprietary Norand protocol seemed like a dinosaur. After a bunch of digging through serial traces and obscure NEC uPD7810 microprocessor code I was able to create a Norand printer emulator. What became clear in this process was that Norand designers had intentionally obfuscated the NPCP protocol and the code that implemented it.

In many ways this is no different than HDMI encryption or software anti-piracy mechanisms. The intellectual property owners want to protect their property.



Content as a hostage

Wind forward to today and I have a customer that would like to extract all their documents from a document management system called ImageNow from Perceptive Software. They have added millions of pages to their ImageNow system over the years, but the annual cost for maintaining the software is prohibitively high and much lower cost and equally capable products are now available in the marketplace.

The documents, the “content”, in ImageNow are unquestionably owned by my customer. The software is installed on a Windows server, the data is stored in an SQL Server database, and the pages are stored in files in the Windows file system. A reasonable expectation for them would be to terminate their contract with Perceptive Software and be able to access their content without use of ImageNow or further cost from Perceptive. Unfortunately it seems this is not possible, as Perceptive has encrypted one critical piece of information that describes the relationship between the documents stored in ImageNow and the pages stored in the Windows file system: the image file path.

I cannot think of any reason that this encrypted file path adds value for any customer of Perceptive. It appears to be in place to force end users to engage Perceptive for additional software and services to extract their documents. The content that is owned by my customer, not Perceptive, is effectively being held hostage by this encrypted file path.

Perceptive does sell an option called ImageNow Output Agent that can export documents in bulk and may help here, but customers in this situation probably do not own this option.

A content lock is a fairly dangerous situation for users of document management systems. End users are left with a difficult or impossible situation if the manufacturer of the software goes out of business, gets acquired, or otherwise changes their plans to continue to support the software. My customer is in a little better position with Perceptive, but one that will require them to spend new money at a time they are trying to downsize their document management system expenditure.



Unlocking the encrypted file path in ImageNow

The good news for my customer and others in this situation is that there is a way to get their file path without engaging Perceptive or purchasing any other modules of ImageNow. They will, however, need the skills and time of a JavaScript programmer.

ImageNow includes a scripting language called iScript. This is basically JavaScript with some libraries specific to ImageNow. This scripting environment can access many or all of the objects stored in ImageNow.

A search of the ImageNow 6.x object documentation will find that INLogicalObject provides information about the actual files stored in the file system. However, the documentation does not mention any file path. A closer inspection under the hood of the object reveals that it does have an undocumented file path member, and the value is not encrypted. The following very simple example finds a single document and displays its file type and unencrypted file path on the console.

 // get a single document
 var more; // declared first: "var" is not valid inside an argument list
 var results = INDocManager.getDocumentsBySqlQuery( "", 1, more );
 if ( results )
 {
     var doc = results[0];
     doc.getInfo();
     // get a single page for the document
     var logob = INLogicalObject( doc.id, -1, 1 );
     logob.retrieveObject();
     // readers report that member names are case sensitive; the wrong case prints "undefined"
     printf( "file type: %s\n", logob.fileType ); // this member is in the documentation
     printf( "unencrypted file path: %s\n", logob.filePath ); // this member is not in the documentation
 }

If you would like to tackle iScript, the documentation appears to be available with an ImageNow installation. There is a small amount of iScript information on the web, including a nice introduction from Blaine Linehan of Wichita State University, who has a blog on programming with iScript for beginners.
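
Once file paths can be read, a natural next step is to write them to a manifest that another tool can consume. The following is a minimal sketch in plain JavaScript (run here under Node.js, not inside ImageNow) of quoting values safely for a CSV file. The helper names csvField and manifestRow and the sample values are hypothetical, and the output call would need to be swapped for whatever your iScript environment provides.

```javascript
// Hypothetical helper: quote a value for CSV output.
// A field containing a comma, quote, or line break is wrapped in double
// quotes, and embedded quotes are doubled.
function csvField(value) {
    var text = String(value);
    if (/[",\r\n]/.test(text)) {
        return '"' + text.replace(/"/g, '""') + '"';
    }
    return text;
}

// Hypothetical helper: build one manifest row from a document ID,
// a page number, and the file path read from the logical object.
function manifestRow(docId, pageNumber, filePath) {
    var fields = [docId, pageNumber, filePath];
    var quoted = [];
    for (var i = 0; i < fields.length; i++) {
        quoted.push(csvField(fields[i]));
    }
    return quoted.join(",");
}

// Sample values are invented for illustration only.
console.log(manifestRow("DOC0001", 1, "D:\\store\\pages\\invoice, final.tif"));
```

Quoting matters here because Windows file names may legally contain commas, which would otherwise shift the columns of the manifest.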


19 comments:

  1. Thank you for this article, it is extremely helpful. I have one question if you don't mind. I am using your exact script and it works except, for the values for filetype and filepath, it just says undefined. I tried changing it to loop through 100 records just in case it was an issue with a record or two but I get the same issue. Do you have any idea what would be causing this? I can send you a screenshot of my script and the output if that helps. Thanks.

    Replies
    1. I know it has been a long time, but perhaps someone has the same problem. It is just a matter of using the exact names filePath and fileType; they are case sensitive.

  2. I can send you a complete solution for this. As it's written right now, extract the DOC_IDs you need via SQL -> put the DOC_IDs to a csv file -> configure the script -> and off we go. The code above is old and you should be using the STL functions as this one does.

    Let me know if you still need something like this.

    Replies
    1. That would be awesome if you could provide a solution for this. I'm currently working on finding out how to update the code to work and any help would be much appreciated. I am looking to do what you had mentioned - export the DOC_IDs along with the decrypted file path to a CSV file.

    2. Hey Greg, I have the same need. I can get it to return the file path, but I'm not sure how to loop through a CSV with iScript. Do you happen to still have the script you mention above?

    3. I'm undertaking the same exact project for the same reasons your client has. Can you send the solution you used and any scripts/troubleshooting tips you can?

      thanks,

    4. Thanks for the guidance. If you are still providing a copy of the solution, please send a copy. Thanks.

    5. We are starting a migration from ImageNow and also discovered the dreaded file path encryption. Would it be possible to send a copy of your path decryption solution?

      I'm getting started with iScript so it's going to take me a bit to progress from 'hello world!' to chewing through our document library ;-)

    6. We would be interested in this solution as well.

  5. In the iScript documentation, I can see a getDocumentsByVslQuery in INDocManager but I cannot find getDocumentsBySqlQuery. Can you point me to where I might find the correct syntax for that function?

    Also, when I run your code using getDocumentsByVslQuery I get an error when I attempt to use "doc.getInfo();"
    Fail to interpret script - TypeError 1406: Variable getInfo is not a function type.

    Any suggestions that you can offer are appreciated greatly.

  6. how can I get a copy of the solution from the above comments? Thanks

  7. has anyone gotten anything on the scripts? I never heard back.
    Thanks

  8. Has anyone got a solution for this? We are in the process of moving away from ImageNow and are stuck looking for the file path.

    The script shown here also gets me "undefined" values so I don't know what is going on there either.

  9. I struggled for a while trying to work out how to follow the above example as I am not too familiar with ImageNow. Eventually I found and used the below. This worked to get all images, annotations and meta data out. Posted here in case it helps someone else.


    http://www.ensentia.co.uk/imagenowextract/

  11. This is the complete solution for this. It gets the root file and subsequent pages. All other solutions I've found do not get anything other than the first page of the scanned document. Change your drawer to your own drawer name (btw). I hope this helps someone. Companies that lock down people's content really make me mad. Just use the intool.exe utility. It's located in the /bin folder of your installation. The call is: intool --cmd run-iscript --file yourfile.js

    var curDocId = 0;
    var more = true;
    // printf("curDocId : %s\n", curDocId );
    while (more) {
        // fetch the next batch of documents with an ID above the last one processed
        var rulestext = "[drawer] = 'AR' AND [docID] > '" + curDocId + "'";
        var items = INDocManager.getDocumentsByVslQuery(rulestext, 1000, more, "DOCUMENT_ID");

        // items[0] holds the number of column header rows that follow it
        var start = items[0];
        var dataDesc = new Array();

        var headerDelim = "\03";
        var dataDelim = "\02";

        // index the column headers by upper-cased name
        for (var line = 1; line <= start; line++) {
            var temp = items[line].split(headerDelim);
            dataDesc[temp[1].toUpperCase()] = new Object();
            dataDesc[temp[1].toUpperCase()].idx = line - 1;
            dataDesc[temp[1].toUpperCase()].name = temp[1];
            dataDesc[temp[1].toUpperCase()].datatype = temp[2];
        }

        // the remaining rows are data rows, one per document
        for ( ; line < items.length; line++) {

            var doc = new INDocument(items[line].split(dataDelim)[dataDesc["DOCUMENT ID"].idx]);
            doc.id = items[line].split(dataDelim)[dataDesc["DOCUMENT ID"].idx];

            doc.getInfo();

            var masterDocId = doc.id;
            var itCounter = 150; // upper limit on pages checked per document
            var i = 1;
            for ( ; i <= itCounter; i++)
            {
                doc.getInfo();
                var logob = INLogicalObject( doc.id, -1, i );
                logob.retrieveObject();

                if (logob && logob.logobCount > 0)
                {
                    // append one CSV row per page; named outLine so it does not
                    // overwrite the loop index "line" (var is function scoped)
                    var fp = Clib.fopen("c:\\inowoutput.txt", "a");
                    var outLine = masterDocId + ',' + logob.id + ',' + logob.workingName + ',' + logob.filePath + '\n';
                    Clib.fputs(outLine, fp);
                    Clib.fclose(fp);
                }
                else
                {
                    // no more pages for this document
                    break;
                }
            }
            curDocId = doc.id;
        }

        //printf("curDocId : %s\n", curDocId );
    }

  12. Here is the complete solution on stack overflow

    https://stackoverflow.com/questions/18613116/is-it-possible-to-combine-imagenow-with-javascript-and-php/54889041#54889041
