Issue w/Screen Scraper & Apache POI: Excel file locked by SS?

Hi everyone,

I have an old scrape that I've been using for over a year that's begun to give me problems. The scrape generates two Excel files via Apache POI (3.6) when it runs; more specifically, SS uses POI to first create the two Excel files and then write the data into them once it has been scraped and processed. This scrape worked without a hitch until last week, but when I run it now, SS creates both Excel files but the second one is a locked, empty 0K file, so SS reports an error when I try to write data to it. When I open this file in Excel, I'm told that it's locked by 'another user' and am given the option to open read-only/notify/cancel. Although I can open the Excel file as 'read-only', I obviously can't do anything with it.

The reason I suspect this is an SS problem (vs. POI) is that when I try to delete the problem Excel file while SS is running, Windows tells me I can't because someone else is using the file. When I close SS, I can delete the problem Excel file, which suggests to me that SS is the 'another user' that didn't close the file properly through POI.

Is there any way I can try to troubleshoot this issue to see what's going on? Failing that, is there a workaround I could use to bypass this problem?

Thanks very much in advance for any help/assistance you can provide and have a great day!

Regards,
Justin

PS: I forgot to mention that I've seen the same problem under SS 5.5 and 6.0; I've also tried to fix it by installing the scrape from a saved export on a new computer and by re-installing POI.

Issues With POI in SS

Justin,

Based on what you have said, I think the problem may lie in one of your streams not being closed correctly. I have run into similar problems with files being locked by 'another user' when I forget to close an output stream. Usually this happens when the stream is closed in an "end" script that is set to run after the scrape completes, but the scrape gets stopped mid-run and the end script never runs.

One solution is to set the end script to run "Always at the end" so that it will be run even if the scraping session is stopped. If you are creating the file and writing to it all in one script, be sure to wrap the code in a try/finally block so the output stream always gets closed:

  import java.io.FileOutputStream;
  import org.apache.poi.hssf.usermodel.HSSFWorkbook;

  FileOutputStream fileOut = null;
  try
  {
    fileOut = new FileOutputStream("myFile.xls");
    HSSFWorkbook wb = new HSSFWorkbook();
    /* Add the data */
    wb.write(fileOut);
  }
  finally
  {
    // Runs whether or not an exception was thrown, so the file is never left open and locked
    if (fileOut != null)
      fileOut.close();
  }

That should solve the problem of the file being locked by 'another user'. It could also solve the 0K issue, since writing data to a stream without closing or flushing it gives no guarantee about when, or if, the data actually gets written to disk.
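
Another thing you could try (purely a sketch; the sheet name and cell value below are made up) is to build the whole workbook in memory first and only open the FileOutputStream at the very end, so the file on disk is never sitting open, empty, and locked while the rest of the script runs:

  import java.io.FileOutputStream;
  import org.apache.poi.hssf.usermodel.HSSFRow;
  import org.apache.poi.hssf.usermodel.HSSFSheet;
  import org.apache.poi.hssf.usermodel.HSSFWorkbook;

  // Build the workbook entirely in memory first
  HSSFWorkbook wb = new HSSFWorkbook();
  HSSFSheet sheet = wb.createSheet("Results");
  HSSFRow row = sheet.createRow(0);
  row.createCell(0).setCellValue("Example value");

  // Only now open the file, write it out, and close it right away
  FileOutputStream fileOut = null;
  try
  {
    fileOut = new FileOutputStream("myFile.xls");
    wb.write(fileOut);
  }
  finally
  {
    if (fileOut != null)
      fileOut.close();
  }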

If you are still unable to solve the problem, could you please either post your code dealing with POI (along with the error SS gives you) or email me the scrape so I can take a closer look at what might be going on?

Thanks,
Mike

Thanks - great advice!

Hi Mike,

Thanks for your reply. Based on what you wrote, I revisited my code and discovered that the script that writes results from SS to the Excel file via POI was choking on one of the SS variables, which had recently been renamed. As a result, SS ran through the script, but the file never got closed properly because the write failed on the missing data. Once I restored the variable to its original name, the scrape started working again... guess I'm eating some crow! ;)
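
In case anyone else hits something similar, here's a rough sketch of the defensive check I'm adding to the write script going forward (the variable name and cell position are made up, and row is the HSSFRow created earlier in the script):

  // Fail loudly if the session variable has gone missing (e.g. after a rename)
  // instead of letting the script die with the output stream still open
  Object price = session.getVariable("PRODUCT_PRICE");
  if (price == null)
  {
    session.log("PRODUCT_PRICE is missing -- was the variable renamed?");
  }
  else
  {
    row.createCell(1).setCellValue(price.toString());
  }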

That said, I really appreciate your advice regarding try/finally blocks; I'm definitely going to implement this in future code.

Thanks again!
Justin